Cortex Feature Downloads
One of the hardest parts of data science and machine learning is determining which features to engineer from the raw input data (see this post for more information). Cortex provides a variety of ways to create features, both automatically and manually, from large-scale, real-time event and attribute data.
We’ve received numerous requests from our partners to download the features generated by the Machine Learning pipelines they’ve created within Cortex. Common use cases include:
- Dashboards (example screenshot below)
- Model understanding
- Feature engineering for additional ML models
We recently launched the ability to download a pipeline’s features directly from the UI. Because some feature files can exceed 15 GB at the scale of our partners’ data, we chose the Parquet format for feature export. Parquet files can be loaded into services like Google BigQuery and Amazon Athena for query-based analysis.
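If you would rather inspect an export locally before loading it into a query service, the sketch below shows one way to do it in Python. It is an illustration, not part of Cortex: it assumes the pyarrow library is installed, and the helper names, file name, and column name are hypothetical. Because exports can be very large, it reads a single column in batches instead of loading the whole file into memory.

```python
from typing import Iterable, Iterator, List, Optional


def iter_column_batches(
    path: str, column: str, batch_size: int = 100_000
) -> Iterator[List[Optional[float]]]:
    """Yield the values of one column from a Parquet file, one batch at a time."""
    # Imported lazily so column_mean below stays usable without pyarrow installed.
    import pyarrow.parquet as pq

    parquet_file = pq.ParquetFile(path)
    # Only the requested column is read, so memory use stays small
    # even when the export is many gigabytes.
    for batch in parquet_file.iter_batches(batch_size=batch_size, columns=[column]):
        yield batch.column(0).to_pylist()


def column_mean(batches: Iterable[List[Optional[float]]]) -> Optional[float]:
    """Compute the mean of a numeric column from batches, skipping missing values."""
    total = 0.0
    count = 0
    for values in batches:
        for value in values:
            if value is not None:
                total += value
                count += 1
    # Return None when there are no non-missing values to average.
    return total / count if count else None
```

For example, with a hypothetical export named features.parquet that contains a numeric column named days_since_last_purchase, calling column_mean(iter_column_batches("features.parquet", "days_since_last_purchase")) would return that feature's average across all rows.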
In future blog posts we will explore how to use exported features to better understand your pipelines and predictions.
Please let us know if you have any feedback, and enjoy the latest Cortex feature!