Hi, I'm Yogendra Verma.

A resourceful Data Scientist with expertise in Python, machine learning, deep learning, and NLP, skilled at unraveling complex challenges with analytical finesse and innovation.

About

As a Data Scientist, I thrive on problem-solving and coding, leveraging a diverse skill set spanning Python, Machine Learning, Deep Learning, NLP, Computer Vision, Image Processing, and Cloud technologies such as AWS and GCP. My passion lies in developing AI/ML applications and conducting Quantitative & Qualitative Analytics to derive valuable insights, addressing real-world challenges with data-driven solutions.

  • Languages: Python, Embedded C
  • Databases: MySQL
  • Libraries: NumPy, Pandas, OpenCV, Keras, TensorFlow, scikit-learn, Matplotlib, RegEx, NLTK, Gensim, spaCy, Plotly
  • Frameworks: Flask, Keras, TensorFlow, PyTorch
  • Tools & Technologies: Git, Docker, AWS, K8s Cluster, GCP, Heroku, MATLAB

I am always seeking to work with a reputed organization where I can grow my skills and use my knowledge to improve the organization's productivity.

Experience

Sr. Data Scientist
  • Design and develop an AI/ML-based Auto Adjudication System that automates the medical claim process, making it faster and more efficient and reducing manual work.
  • Design and deploy APIs with CI/CD pipelines, using Docker for containerization and scalable application deployment.
  • Manage AWS deployments and applications, and optimize solutions on K8s clusters to improve performance, with capacity to handle 100,000 documents per hour.
  • Tools: AI/ML, Deep Learning, Computer Vision, Tesseract, Python, API Design, Flask, Docker, K8s cluster, AWS, Document Classification, OpenAI
March 2023 - Present | Hyderabad, India
Sr. Data Scientist
  • Design and develop an e-billing assistant machine learning model that automatically flags bills likely to be rejected.
  • Design and develop an automated bill remediation system for rejected line items in e-bills, to quantify the current rejection rate and help reduce it.
  • Build an AI/ML-based model to identify fraud in bills issued by legal firms to their customers, removing most of the manual effort.
  • Tools: Python, OpenCV, Keras, TensorFlow, PyTorch, BERT, Word Embeddings, Text Classification, AWS Textract, Pandas Profiling, spaCy, NLTK, Gensim, SHAP, LIME, Seaborn
June 2022 - Feb 2023 | Pune, India
NLP Platform Engineer
  • Designing and constructing a document classification system to automate the loan underwriting process.
  • Clustering documents from comprehensive loan packages, comprising over 1500 pages and spanning 30+ categories concurrently.
  • Extracting data from scanned PDF bank documents, including tables and images, to expedite the loan application underwriting process.
  • Tools: Python, Layout Detection, Document Classification, BERT, Random Forest, CNN, RNN, RCNN, LSTM, Pandas, NumPy, Image Processing, OpenCV, Computer Vision, Camelot, AWS Textract, NER, spaCy, NLTK, Gensim, SHAP, LIME, Seaborn, Univariate and Multivariate Analysis
Dec 2020 - Dec 2021 | Gurugram, India
Data Scientist
  • Tools: Machine Learning, Python, Pandas, scikit-learn, Natural Language Processing, NLTK, RegEx, Stemming, Lemmatization, CountVectorizer, TF-IDF, WordCloud, Selenium web scraping, Classification Models, Sentiment Analysis, Topic Modeling, Text Summarization, Hugging Face, YOLO
  • Design and build an AI/ML-based automation system for the support process, covering customer review sentiment analysis and review summarization.
Aug 2018 - Nov 2020 | Delhi, India
Embedded System Engineer
  • Tools: IoT, Raspberry Pi, NodeMCU, AWS, MQTT, Node-RED, Pub/Sub, HTTP, HTML, PHP, Webhooks
  • Design and develop an IoT-based smart home automation system to control appliances from anywhere in the world.
  • Develop both major and minor academic Embedded Systems and IoT projects.
May 2014 - Jan 2016 | Delhi, India
Research Engineer
  • Tools: Embedded Systems, Embedded C, Raspberry Pi, USART, MQTT, Microcontrollers, Microprocessors, Proteus Tool, Raspberry Pi OS
  • Design and engineer Embedded Systems for specialized tasks and Smart Automation applications.
May 2012 - July 2013 | Delhi, India

Projects

Auto Adjudication System

An AI/ML-based model to classify medical documents

Accomplishments
  • Tools: AI/ML, Deep Learning, Computer Vision, Tesseract, Python, API Design, Flask, Docker, K8s cluster, AWS, Document Classification
  • An AI/ML-based Auto Adjudication System that automates the medical claim process, making it faster and more efficient and reducing manual work.
  • Designed and deployed APIs with CI/CD pipelines, using Docker for containerization and scalable application deployment.
  • Managed AWS deployments and applications, and optimized the solution on a K8s cluster to improve performance.
  • Tested the model on real documents and achieved 95% overall accuracy.
Document Quality Index Calculator

Calculates a Document Quality Index using Computer Vision

Accomplishments
  • Tools: Python, OpenCV, PyTorch, Computer Vision, Image Processing
  • Calculates a Document Quality Index for documents using Computer Vision.
  • Turned the document quality solution into an API.
E-billing Assistant Automation System

An AI/ML-based E-billing Assistant Automation System

Accomplishments
  • Tools: Python, OpenCV, Keras, TensorFlow, PyTorch, BERT, Word Embeddings, Text Classification, AWS Textract
  • An AI/ML model to flag potential bill rejections automatically.
  • A system to quantify rejected items in e-bills and reduce rejection rates.
  • An AI/ML model to detect fraud in legal firms' bill processing, minimizing human effort.
E-bill remediation system

E-bill remediation system to quantify rejection rates.

Accomplishments
  • Tools: Python, OpenCV, Keras, TensorFlow, PyTorch, BERT, Word Embeddings, Multi-label Text Classification, AWS Textract
  • An automated bill remediation system to quantify rejection rates and reduce them.
  • A solution to assess rejected items in e-bills, aiming to give the reason for each rejection.
  • An AI-powered fraud detection system for legal bill processing, streamlining operations.
Classification Solution for Scanned Documents

Document classification solution for scanned PDF files in loan applications

Accomplishments
  • Tools: Python, Document Classification, BERT, Random Forest, CNN, RNN, RCNN, LSTM, Pandas, NumPy, Image Processing, OpenCV, AWS Textract
  • Designing and constructing a document classification system to automate the loan underwriting process.
  • Clustering documents from comprehensive loan packages, comprising over 1500 pages and spanning 30+ categories concurrently.
  • Tested the model on real scanned documents and achieved 90% overall accuracy.
Scanned Documents Data Extraction

A data extractor for scanned documents

Accomplishments
  • Tools: Python, Layout Detection, CNN, RNN, RCNN, LSTM, Pandas, NumPy, Image Processing, OpenCV, Computer Vision, Camelot, AWS Textract
  • Extracts data from scanned PDF bank documents using Deep Learning.
  • Extracting data from scanned PDF bank documents, including tables and images, to expedite the loan application underwriting process.

Skills

Languages and Databases

Python
MySQL
Embedded C

Libraries

NumPy
Pandas
OpenCV
scikit-learn
matplotlib
Plotly

Frameworks

Flask
Keras
TensorFlow
PyTorch
MATLAB
Hugging Face

Other

Git
AWS
Heroku
Docker
Kubernetes

Education

BPUT University

Orissa, India

Degree: Master of Technology in Computer Science and Engineering
CGPA: 8.12/10.0

    Relevant Coursework:

    • Internet of Things
    • Artificial Neural Networks Modeling
    • Foundations of Algorithms
    • Computer Vision
    • Machine Learning
    • Deep Learning

ICFAI University

Dehradun, India

Degree: Bachelor of Technology in Electronics and Communication Engineering
CGPA: 6.2/10.0

    Relevant Coursework:

    • Digital Image Processing
    • Embedded System and Robotics
    • Digital Signal Processing
    • C and Embedded C Programming
    • MATLAB

Contact