Extract valuable knowledge from your data.
Explore new business opportunities with text analytics.
Below are a few use cases, inspired by solutions I successfully completed for my customers. The underlying theme is always natural language processing, machine learning and big data processing. If you’d like a quote for a particular solution, or want to suggest your own, get in touch.
I also offer training, architectural reviews and design validation in the areas of data analysis and machine learning.
Build me a custom search solution.
I can design and implement a custom indexing service for your business:
- build a fully hosted, self-sufficient solution: your sensitive data never leaves the house
- tailor the search exactly to your needs—including which entities and relationships are important to you, a custom relevancy ranking, and the way the data gets updated for the freshest results
- design for scale, with a clean API for large datasets, for when “let’s stuff everything into MySQL and query with LIKE '%term%'” no longer cuts it
- build upon powerful, proven tools, such as Apache Solr and Elasticsearch—rest assured your pipeline is optimized, minimizing query latency as well as hardware and maintenance costs
- see the Naviga project for a reference
Help me understand my data.
I offer a full package of natural language processing techniques, serving various goals such as:
- extract organizations, people and events from unstructured text (named entity recognition)
- find salient themes and categorize your documents according to those themes—automatically, without having to tag data manually
- browse related documents, with “relatedness” defined by your business needs, including semantic similarity
- route your support requests—let the machine analyze a piece of text and its tone (sentiment analysis, language detection)
- provide intelligent spelling correction, both for your internal use and as a service offered to your users
- see the Seznam and EuDML projects for references
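To make the “related documents” idea above concrete, here is a minimal sketch of similarity ranking using TF-IDF weights and cosine similarity. It is an illustration only, not the method used in the referenced projects: the function names (`tokenize`, `tfidf_vectors`, `cosine`, `most_similar`) and the sample support tickets are hypothetical, and a production system would typically define “relatedness” with richer, business-specific signals.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic runs only (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed inverse document frequency keeps every weight positive.
        vectors.append(
            {t: count * (math.log((1 + n_docs) / (1 + df[t])) + 1.0)
             for t, count in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(index, vectors):
    """Rank all other documents by similarity to vectors[index], best first."""
    scores = [(i, cosine(vectors[index], v))
              for i, v in enumerate(vectors) if i != index]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical support tickets, used purely as sample input.
tickets = [
    "The printer is out of toner",
    "Replace the toner in the printer",
    "Invoice for annual subscription",
]
ranking = most_similar(0, tfidf_vectors(tickets))
```

Here `ranking` lists the other tickets ordered by how similar they are to the first one; the second ticket shares vocabulary with it, while the third shares none.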
Help me collect and process my data.
So, you don’t have the necessary data yet? Want to bootstrap your business with public databases, directories and listings (such as Yelp or Foursquare)? I can help you connect to existing services and data:
- construct a pipeline that will efficiently crawl, process and update data
- pipe in your internal database or harvest data from external sources (such as the “world wild web”)
- extract meaningful text and entities from web pages, PDF files, OCR scans
- automate categorizing and tagging of text data according to your needs
- display statistics and actionable analytics from processed data
- normalize data and match equivalent products
- see the Naviga and gensim projects for references
Help my users make the most of my business.
Increasing revenue through better click-through rate and higher conversions is the evergreen challenge of every online business. I offer my experience in design and implementation of state-of-the-art algorithms for:
- ad optimization & pricing: let the real data (=user traffic) decide which ads to display, when, to whom, for how much
- algorithms that learn from experience and improve with time (split testing, multi-armed bandits)
- systems that deal gracefully with data sparsity or lack of context (intelligent data aggregation, user profiling)
- exploit users’ buying patterns to offer real-time item recommendations: machine learning goes well beyond the trivial “you bought X, here are accessories for X”, and improves with time
- real-time detection of unusual events (outlier/fraud detection)
- check out the Sklik and gensim projects
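As an illustration of the “algorithms that learn from experience” item above, the following is a minimal sketch of an epsilon-greedy multi-armed bandit choosing between ads. It is one simple bandit strategy among many and is not taken from the referenced projects; the class name `EpsilonGreedyBandit`, the `simulate` helper and the click-through rates are assumptions made up for the example.

```python
import random


class EpsilonGreedyBandit:
    """Chooses among ads (arms); explores with probability epsilon, otherwise exploits."""

    def __init__(self, n_arms, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # how many times each ad was shown
        self.values = [0.0] * n_arms  # running mean reward (observed click rate) per ad
        self.rng = random.Random(seed)

    def select(self):
        """Return the index of the ad to show next."""
        if self.rng.random() < self.epsilon:
            # Explore: pick any ad uniformly at random.
            return self.rng.randrange(len(self.counts))
        # Exploit: pick the ad with the best observed mean (first one wins ties).
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        """Fold one observed reward (1.0 = click, 0.0 = no click) into the running mean."""
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def simulate(click_rates, rounds, epsilon=0.1, seed=0):
    """Run the bandit against simulated traffic with the given (hypothetical) click rates."""
    bandit = EpsilonGreedyBandit(len(click_rates), epsilon, seed)
    traffic = random.Random(seed + 1)  # separate generator for the simulated users
    for _ in range(rounds):
        arm = bandit.select()
        reward = 1.0 if traffic.random() < click_rates[arm] else 0.0
        bandit.update(arm, reward)
    return bandit


# Three hypothetical ads with made-up true click-through rates.
result = simulate([0.1, 0.3, 0.6], rounds=5000)
```

After the run, `result.counts` shows how the traffic was split and `result.values` holds the estimated click rate per ad. Unlike a fixed split test, the bandit shifts most impressions to the better-performing ad while the experiment is still running.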