Feature engineering, or the process of using domain knowledge to extract features from data, is essential to tuning AI and machine learning performance. It’s also typically arduous and involves rewriting features before they’re deployed. Often, a missing piece is infrastructure that bridges the gap between model training and the serving of AI results in production environments.
That’s why in 2018, Ben Chambers and Davor Bonaci cofounded Kaskada, which uses mining techniques to compute and serve AI features in real time. Today, following the closure of an $8 million funding round, the Seattle, Washington-based startup announced the general availability of its feature engineering platform for individual data scientists and companies, after a period of beta testing with early adopters.
The Kaskada platform doesn’t require setup. It ingests historical data from data warehouses and data lakes and consumes messages from streaming sources like Apache Kafka and Amazon Kinesis. It transforms this event-based data into a format suitable for aggregations and other calculations, and it lets data scientists write calculations using data from connected sources while visualizing distributions and drilling into outliers.
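To make the idea of turning event data into aggregated features concrete, here is a minimal Python sketch. It is not Kaskada's API, which this article does not document; the event fields (`user`, `ts`, `amount`) and the `window_features` helper are hypothetical names chosen for illustration. It computes per-user counts and totals over a trailing time window as of a given moment.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical event records; the field names are assumptions for illustration.
EVENTS = [
    {"user": "a", "ts": datetime(2021, 1, 1, 9, 0), "amount": 20.0},
    {"user": "a", "ts": datetime(2021, 1, 1, 9, 30), "amount": 5.0},
    {"user": "b", "ts": datetime(2021, 1, 1, 9, 45), "amount": 12.5},
    {"user": "a", "ts": datetime(2021, 1, 1, 11, 0), "amount": 7.5},
]


def window_features(events, as_of, window):
    """Aggregate events per user over the window (as_of - window, as_of]."""
    start = as_of - window
    totals = defaultdict(lambda: {"count": 0, "total": 0.0})
    for event in events:
        # Only events inside the window count; later events are ignored,
        # so the feature reflects what was known at `as_of`.
        if start < event["ts"] <= as_of:
            row = totals[event["user"]]
            row["count"] += 1
            row["total"] += event["amount"]
    return dict(totals)


features = window_features(EVENTS, datetime(2021, 1, 1, 10, 0), timedelta(hours=1))
```

Excluding events that occur after `as_of` is the detail that keeps a feature computed for training consistent with what would have been available at serving time.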
According to market research firm Tractica, the global AI software market is expected to experience “massive” growth in the coming years, with revenues increasing from $9.5 billion in 2018 to an expected $118.6 billion by 2025. A number of startups are attempting to cash in on the trend, or have already done so, including Determined AI, which recently raised $11 million to further develop its deep learning model development tools for data scientists and AI engineers. Meanwhile, Iguazio nabbed $24 million for its suite of AI development and management tools, and Clusterone raked in $2 million for its DevOps-for-AI platform, which operates with both on-premises servers and public cloud computing platforms like AWS, Azure, and Google Cloud Platform.
Incumbents are also angling to corner a slice of the growing AI and machine learning data prep market. Last December, Amazon introduced SageMaker Data Wrangler, which ostensibly simplifies the process of feature engineering by enabling developers to choose and import the data they want from various stores with a single click.
But Kaskada claims its platform is the first to focus exclusively on the feature engineering and serving experience. To this end, it includes a collaborative interface for data scientists and is powered by proprietary data infrastructure for computing across event-based data and serving features in production. With Kaskada, data scientists can scale, transform, and encode features and view all features for a model within a dashboard. They can also identify which features to export for training, promotion to production, and testing, measure the performance of features over time, and update production feature versions by calling a feature store API.
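For readers unfamiliar with what scaling and encoding a feature involve, the following Python sketch shows two common generic techniques: min-max scaling and one-hot encoding. It is an illustration under stated assumptions, not Kaskada's implementation; the function names and sample values are hypothetical.

```python
def min_max_scale(values):
    """Rescale numbers linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot_encode(labels):
    """Map each label to a 0/1 vector, with columns in sorted label order."""
    categories = sorted(set(labels))
    rows = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, rows


# Hypothetical sample data for illustration.
scaled = min_max_scale([10.0, 20.0, 30.0])
categories, encoded = one_hot_encode(["web", "mobile", "web"])
```

Here `scaled` holds the numeric feature mapped onto the range 0 to 1, and `encoded` holds one row per input label with a single 1 marking its category.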
“Kaskada’s feature engineering platform is designed to make truly hard data problems in machine learning easy,” Bonaci said. “Data science teams can now work better together, build better features, and deliver results at a whole new level. I cannot wait to see what kind of impact they’ll accomplish in the months and years to come.”
The Kaskada platform is free to start, and data scientists have the option of paying to add users, manage more data, and access more features.
The post Kaskada launches platform to prep data for AI models appeared first on VentureBeat.