BSL has focused on intelligent search applications for more than twenty years. Search technology, intelligence and performance are continually evolving, which is why we adopt a range of technologies within our software solutions. This blog post outlines some of our experiences with Azure Cognitive Search – a Microsoft service that we will shortly be using to classify thousands of documents each day, delivering highly customised corporate insights to subscribers.
Intelligent Search vs Dumb Search
Dumb search is what we use almost every day in the Google search bar. When I recently searched for “Stretch film”, looking for an old movie review, I got a whole range of cling film solutions and vendors. No movies. Disappointment. 😔 That is because, by default, Google is just looking for word matches with no other context. But if I click in the search bar, I see a list of actors who appear in the film I was searching for. So Google uses a different algorithm to populate its “People also search for” list, and this part of Google has clearly understood the context: it knows that “film” can also refer to movies, and so it returns what it knows about the movie “Stretch”. Not so dumb after all.
Helping Google to search better
You can give Google hints. For example, searching for “Film Stretch” still finds some cling film ads (because companies use these words in their ad campaigns), but Google appears to give the word “Film” more priority because it comes first, and so it shows the film in its custom results block. It also returns Wikipedia and IMDb matches. “Movie Stretch” is even better, as it removes all the confusion: unlike the word “Film”, the meaning of “Movie” is unambiguous. With a bit of effort, you can persuade Google to look for particular content and even tell it where to look, but very few users spend any time learning Google syntax. If you’re interested, this cheat sheet may help you to learn more about Google search capabilities. Just using elementary operators such as AND/OR, quotation marks, and parentheses will help you use Google better.
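As a small illustration of those elementary operators, the sketch below composes a query string from an exact phrase, a group of alternatives and an optional site restriction. The helper name `build_google_query` and its parameters are our own invention for this example; only the operators themselves (quotation marks, OR, parentheses and `site:`) come from Google's search syntax.

```python
from urllib.parse import urlencode


def build_google_query(phrase, any_of=(), site=None):
    """Compose a query string from a few elementary Google operators.

    Hypothetical helper for illustration only; it is not part of any Google API.
    """
    parts = [f'"{phrase}"']  # quotation marks request an exact phrase match
    if any_of:
        # Parentheses group the alternatives; OR matches any one of them.
        parts.append("(" + " OR ".join(any_of) + ")")
    if site:
        parts.append(f"site:{site}")  # restrict results to a single website
    return " ".join(parts)


query = build_google_query("Stretch", any_of=["movie", "film"], site="imdb.com")
print(query)
# The same query as a URL that can be pasted into a browser.
print("https://www.google.com/search?" + urlencode({"q": query}))
```

Pasting the printed query into the search bar has the same effect as typing the operators by hand.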
Intelligent search in business apps
Context is king in business applications. Users expect a degree of intelligence that you won’t find when you just type a few words into Google’s search bar. Azure Cognitive Search is a cloud search service with built-in AI search capabilities. It can automatically enrich information, helping you identify and explore relevant content, and it scales to very large quantities of data. Developers can assign cognitive skills for vision, language, and speech or use custom machine learning to understand your content. Azure Cognitive Search also provides semantic search capabilities, which use advanced machine learning to understand user intent and contextually rank the most relevant search results.
Making your content easily accessible
We regularly work with clients who hold hundreds of thousands of documents, sometimes adding thousands of documents each day. Some of these documents can be simple text, HTML or XML files, but we frequently work with many different sources, including Word documents, Excel spreadsheets, PowerPoint decks or PDFs. Azure Cognitive Search can handle these and many more formats. It can also use OCR to recognise and extract text from images. As we import all these documents, we index the contents, creating a searchable archive. Users can then search for information using simple or complex queries, finding matches with Booleans, wildcards, stemming, and other language features.
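To make the query side concrete, here is a minimal sketch of a request body for the Search Documents REST operation, using the full Lucene query syntax so that Boolean operators and wildcards are honoured. The helper name and the example query text are our own assumptions; `search`, `queryType`, `searchMode`, `top` and `count` are the request properties as we understand them from the REST reference, so check them against the API version you are using.

```python
import json


def build_search_request(search_text, top=10):
    """Return a JSON-serialisable body for a Search Documents request.

    Hypothetical helper: it only builds the payload and sends nothing.
    """
    return {
        "search": search_text,
        "queryType": "full",  # full Lucene syntax: Booleans, wildcards, fuzzy terms
        "searchMode": "all",  # require all criteria to match rather than any
        "top": top,           # number of results to return
        "count": True,        # also ask for the total number of matches
    }


# Illustrative query: either noun, plus any word starting with "bank",
# excluding documents that mention "rumour".
body = build_search_request("(merger OR acquisition) AND bank* NOT rumour", top=5)
print(json.dumps(body, indent=2))
```

The body would be sent with an HTTP POST to the index's docs/search endpoint, together with an API key; both are left out here because they are specific to each deployment.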
Extracting meaning and enriching content using “Skills”
Skills are Azure Cognitive Search services that extract content, structure and meaning from raw unstructured text and image files. For example, sentiment analysis can identify whether a document is positive, negative, or just neutral. Other “skills” can identify the source language and automatically translate all the contents to English or another language. There are more than 15 of these skills to choose from – and you can even harness the power of Cognitive Search AI to develop highly customised skills that you create specifically for your client. Three of the most valuable and flexible skills available to developers are “Entity Recognition”, “Personal Information”, and “Key Phrases”.
The Entity Recognition skill extracts entities of different types from the text. These entities fall under 14 distinct categories, ranging from people (names) and organisations to URLs and phone numbers. Others include dates and times, addresses, email addresses and quantities. You can automatically extract these items from documents or text and use them for searches or as metadata to classify the contents. For example, you can quickly tag all documents that refer to ABN AMRO or Coca-Cola.
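The sketch below shows roughly what this looks like in practice: a skill definition limited to two entity categories, followed by a small tagging step applied to the extracted organisations. The function names, the target field names and the sample organisation list are hypothetical; the `@odata.type`, category and output names reflect our reading of the skill reference and should be verified against the current documentation.

```python
import json


def entity_recognition_skill(categories, source="/document/content"):
    """Build an Entity Recognition skill definition as a plain dict (illustrative)."""
    return {
        "@odata.type": "#Microsoft.Skills.Text.V3.EntityRecognitionSkill",
        "context": "/document",
        "categories": categories,  # limit extraction to the categories we need
        "defaultLanguageCode": "en",
        "inputs": [{"name": "text", "source": source}],
        "outputs": [
            # targetName values are our own choice of field names.
            {"name": "organizations", "targetName": "organisations"},
            {"name": "persons", "targetName": "people"},
        ],
    }


def tag_by_organisation(doc_organisations, watchlist):
    """Return the watchlist names found among a document's extracted organisations."""
    found = {name.casefold() for name in doc_organisations}
    return [name for name in watchlist if name.casefold() in found]


skill = entity_recognition_skill(["Organization", "Person"])
print(json.dumps(skill, indent=2))

# Made-up extraction result for a single document.
sample_organisations = ["ABN AMRO", "European Central Bank"]
tags = tag_by_organisation(sample_organisations, ["ABN AMRO", "Coca-Cola"])
print(tags)
```

Matching is case-insensitive but otherwise exact, so variants such as “Coca Cola” without the hyphen would need their own watchlist entry.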
GDPR requires specific care when creating datasets. In most cases, you need to protect the privacy of individuals by hiding their personal details. In addition, the storage and use of personal information are subject to ever-increasing control. Using Personal Information “skills”, you can automatically detect personal information such as social security numbers, names, and addresses and automatically mask the information before storing the document contents.
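As an illustration, a skill definition for detecting and masking personal information might look like the sketch below. The function name and target field names are hypothetical; the `@odata.type`, the `maskingMode` setting and the `piiEntities` and `maskedText` outputs reflect our understanding of the PII Detection skill reference and should be checked against the current documentation.

```python
import json


def pii_detection_skill(source="/document/content"):
    """Build a PII Detection skill definition that masks detected values (illustrative)."""
    return {
        "@odata.type": "#Microsoft.Skills.Text.PIIDetectionSkill",
        "context": "/document",
        "defaultLanguageCode": "en",
        "maskingMode": "replace",  # replace detected values instead of only reporting them
        "inputs": [{"name": "text", "source": source}],
        "outputs": [
            # targetName values are our own choice of field names.
            {"name": "piiEntities", "targetName": "pii_entities"},
            {"name": "maskedText", "targetName": "masked_content"},
        ],
    }


pii_skill = pii_detection_skill()
print(json.dumps(pii_skill, indent=2))
```

Mapping only the masked output into the index, rather than the original content, is one way to keep the personal details out of the stored document.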
The Key Phrase Extraction skill evaluates unstructured text and, for each record, returns a list of key phrases. This skill uses the Key Phrases machine learning models supplied within Cognitive Search. This capability is useful if you need to identify the main talking points within the contents. For example, given the input text “The food was delicious, and there were wonderful staff”, the service returns “food” and “wonderful staff”.
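A minimal sketch of how this skill could be wrapped in a skillset definition follows. The skillset name, description and target field name are made up for the example; the `@odata.type`, `maxKeyPhraseCount` and the `keyPhrases` output are taken from the skill reference as we understand it and should be checked before use.

```python
import json


def key_phrase_skillset(name, max_phrases=10):
    """Build a skillset containing a single Key Phrase Extraction skill (illustrative)."""
    return {
        "name": name,
        "description": "Extracts the main talking points from each document",
        "skills": [
            {
                "@odata.type": "#Microsoft.Skills.Text.KeyPhraseExtractionSkill",
                "context": "/document",
                "defaultLanguageCode": "en",
                "maxKeyPhraseCount": max_phrases,  # cap the phrases returned per document
                "inputs": [{"name": "text", "source": "/document/content"}],
                # targetName is our own choice of field name.
                "outputs": [{"name": "keyPhrases", "targetName": "key_phrases"}],
            }
        ],
    }


skillset = key_phrase_skillset("demo-keyphrase-skillset", max_phrases=5)
print(json.dumps(skillset, indent=2))
```

Further skills can be appended to the `skills` list so that one indexing run applies several enrichments to each document.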
These are only a few examples of Cognitive Search’s flexibility. We can use Azure Cognitive Search services to rapidly develop completely custom systems that import your content and automatically apply AI and machine learning models to enrich it, while at the same time honouring GDPR commitments and privacy. So why not contact BSL to discuss how we can help you create an intelligent datastore within your organisation? We can build prototypes in just a few weeks.