Marie Stephen Leo

Published in Towards AI

·Pinned

Make Extra Money on the Side With Data Science!

Monetize your Machine Learning models using FastAPI, Docker, GCP Cloud Run, and RapidAPI in 5 easy steps! — The dream of every FIRE (Financial Independence, Retire Early) enthusiast is to earn enough passive income to retire early and sip coconuts on a tropical beach 🏖️! In my pursuit of FIRE, I stumbled upon a neat way to monetize my Data Science side projects by deploying models as…

Data Science

6 min read

Published in Towards Data Science

·Pinned

Semantic Textual Similarity

From Jaccard to OpenAI, implement the best NLP algorithm for your semantic textual similarity projects — 🎬 Introduction: Natural Language Processing (NLP) has tremendous real-world applications in information extraction, natural language understanding, and natural language generation. Comparing the similarity between natural language texts is essential to many information extraction applications such as Google Search, Spotify’s Podcast search, Home Depot’s product search, etc. The semantic textual similarity (STS) problem…

Machine Learning

16 min read

Published in Vector Database for AI

·Pinned

Supercharged Semantic Similarity Search in Production

Blazing Fast, Highly Scalable Text-to-Image Search with CLIP embeddings and Milvus — Introduction: In my previous post, I introduced three simple methods to convert images into embeddings for similarity search applications using state-of-the-art neural networks. In this post, let's discuss how we can use those embeddings together with Milvus, one of the most popular open-source vector search databases, to create a production scale…

Artificial Intelligence

9 min read

Published in Towards AI

·Pinned

KNN (K-Nearest Neighbors) is Dead!

Long live ANNs for their whopping 380X speedup over sklearn’s KNN while delivering 99.3% similar results. — We’re living through an extinction-level event. No, not COVID-19. I’m talking about the demise of the popular KNN algorithm that is taught in pretty much every Data Science course! Read on to find out what’s replacing this staple in every Data Scientist’s toolkit. KNN Background: Finding “K” similar items to any given…

Machine Learning

7 min read

Published in Towards Data Science

·Pinned

Powering Semantic Similarity Search in Computer Vision with State of the Art Embeddings

Easiest ways to perform image-to-image and text-to-image similarity search — A whopping 90% of data created since the dawn of human civilization was produced in the past two years! The rate of data creation continues to increase with the proliferation of digital technologies such as social media and the internet of things (IoT) together with ever-faster wireless communication technologies such…

Artificial Intelligence

17 min read

Published in Towards AI

·May 17

No Training Data? No Problem! Weak Supervision to the Rescue!

Use domain knowledge to generate large labeled datasets with state-of-the-art NLP Weak Supervision. — 🚧 The challenge of contemporary Machine Learning: One of the major bottlenecks for developing modern machine learning (ML) models in real-world applications is the need for substantial amounts of manually labeled training data [Paper]. For example, the ImageNet dataset consists of over 14 million manually labeled images of various real-world objects. …

Artificial Intelligence

8 min read

Published in Towards Data Science

·Sep 9, 2021

Boy or Girl? A Machine Learning Web App to Detect Gender from Name

Find out a name’s likely gender using Natural Language Processing in TensorFlow, Plotly Dash, and Heroku. — Choosing a name for your child is one of the most stressful decisions you’ll have to make as a new parent. Especially for a data-driven guy like me, having to decide on a name without any prior data about my child’s character and preferences is a nightmare come true! Since…

Artificial Intelligence

7 min read

Published in Towards Data Science

·Jul 21, 2021

Have a SQL Interview Coming Up? Ace It Using Google Colab!

Set up & run SQL in Google Colab with just 2 helper functions! — Coding tests are pretty much standard in Data Science interview processes these days. As a Data Science hiring manager, I find a 20–30 minute live coding test with some prepared tasks to be effective at identifying candidates who would be successful in the roles that I typically hire for. Google…

Data Science

5 min read

Published in Towards Data Science

·Jul 15, 2021

Stop Using Print! Python Logging for Data Scientists

~80% of what you need to know about logging in under 5 mins — There comes a time in every production Data Science project when the code base has become complex, and a refactor is necessary to maintain your sanity. Perhaps you want to abstract out commonly used code into Python modules with classes and functions so that it can be reused with a…

Python

4 min read

Jun 20, 2021

Yikes! My Data Science Medium Story Got Plagiarised!

What you can do when someone blatantly plagiarises your work on Medium — Publishing Data Science stories on Medium is hard work. It takes weeks (even months) to research interesting topics, architect code in the simplest way possible, and weave it all into an engaging story. …

Medium

4 min read

Director of Data Science | NLP | ML at scale | GCP | AWS | linkedin.com/in/marie-stephen-leo
