At Snowflake I have been straddling product engineering and model training. I have been involved in training Snowflake's own embedding model series, Arctic Embed v1 and Arctic Embed v2, as well as Snowflake's text2sql model. This talk is about how we at Snowflake decided when to train a model ourselves versus when to take advantage of open source models or partner with frontier model providers. I then briefly describe the models we have trained and the technical lessons we learned along the way.
Gaurav Nuti is an engineer at Snowflake working on various products, including Snowflake Intelligence, Cortex Search, and Cortex Analyst. He works on bringing research ideas into production systems. His expertise lies in RAG, embedding models, pretraining, and text2sql. He is a co-creator of the Arctic Embed family of models and the Arctic Text2SQL models.