Hi, I'm Yogendra Verma.
A resourceful Data Scientist with expertise in Python, machine learning, deep learning, and NLP, skilled at unraveling complex challenges with analytical finesse and innovation.
About
As a Data Scientist, I thrive on problem-solving and coding, leveraging a diverse skill set spanning Python, Machine Learning, Deep Learning, NLP, Computer Vision, Image Processing, and Cloud technologies such as AWS and GCP. My passion lies in developing AI/ML applications and conducting Quantitative & Qualitative Analytics to derive valuable insights, addressing real-world challenges with data-driven solutions.
- Languages: Python, Embedded C
- Databases: MySQL
- Libraries: NumPy, Pandas, OpenCV, Keras, TensorFlow, Scikit-learn, Matplotlib, RegEx, NLTK, Gensim, SpaCy, Plotly
- Frameworks: Flask, Keras, TensorFlow, PyTorch
- Tools & Technologies: Git, Docker, AWS, K8s Cluster, GCP, Heroku, MATLAB
I am always looking to work with a reputed organization where I can grow my skills and use my knowledge to improve the organization's productivity.
Experience
- Design and develop an AIML-based Auto Adjudication System that automates the medical claim process, making it faster and more efficient and reducing manual work.
- Designing and deploying APIs with CI/CD pipelines, using Docker image builds for scalable, efficient containerization and application deployment.
- Handling AWS deployments, managing applications, and optimizing solutions on K8s clusters to enhance performance, with the system capable of processing 100,000 documents per hour.
- Tools: AIML, Deep Learning, Computer Vision, Tesseract, Python, API Design, Flask, Docker, K8s Cluster, AWS, Document Classification, OpenAI
- Design and develop an e-billing assistant machine learning model which can automatically identify or flag the bills that have the potential to be rejected.
- Design and develop an automated bill remediation system for rejected line items in e-bills, to quantify the current rejection rate and help reduce it.
- Building an AIML-based model to identify fraud in the bills that legal firms issue to their customers, which also removes most of the manual effort.
- Tools: Python, OpenCV, Keras, TensorFlow, PyTorch, BERT, Word Embeddings, Text Classification, AWS Textract, Pandas Profiling, SpaCy, NLTK, Gensim, SHAP, LIME, Seaborn
- Designing and constructing a document classification system to automate the loan underwriting process.
- Clustering documents from comprehensive loan packages, comprising over 1500 pages and spanning 30+ categories concurrently.
- Extracting data from scanned PDF bank documents, including tables and images, to expedite the loan application underwriting process.
- Tools: Python, Layout Detection, Document Classification, BERT, Random Forest, CNN, RNN, RCNN, LSTM, Pandas, NumPy, Image Processing, OpenCV, Computer Vision, Camelot, AWS Textract, NER, SpaCy, NLTK, Gensim, SHAP, LIME, Seaborn, Univariate and Multivariate Analysis
- Tools: Machine Learning, Python, Pandas, Scikit-learn, Natural Language Processing, NLTK, RegEx, Stemming, Lemmatization, CountVectorizer, TF-IDF, WordCloud, Selenium Web Scraping, Classification Models, Sentiment Analysis, Topic Modeling, Text Summarization, Hugging Face, YOLO
- Designing and building an AIML-based automation system for support processes, covering sentiment analysis and summarization of customer reviews.
- Tools: IoT, Raspberry Pi, NodeMCU, AWS, MQTT, Node-RED, Pub/Sub, HTTP, HTML, PHP, Webhook
- Design and develop an IoT-based smart home automation system to control appliances from anywhere in the world.
- Additionally, developing both major and minor academic Embedded Systems and IoT projects
- Tools: Embedded Systems, Embedded C, Raspberry Pi, USART, MQTT, Microcontrollers, Microprocessors, Proteus Tool, Raspberry Pi OS
- Design and engineer Embedded Systems for specialized tasks and Smart Automation applications.
Projects
![Auto Adjudication System](/assets/img/project-quizup-logo-1.png)
An AIML-based model to classify medical documents
- Tools: AIML, Deep Learning, Computer Vision, Tesseract, Python, API Design, Flask, Docker, K8s Cluster, AWS, Document Classification
- AIML-based Auto Adjudication System that automates the medical claim process, making it faster and more efficient and reducing manual work.
- Designing and deploying APIs with CI/CD pipelines, using Docker image builds for scalable, efficient containerization and application deployment.
- Handling AWS deployments, managing applications, and optimizing the solution on a K8s cluster to scale up performance.
- Tested the model on real documents and achieved 95% overall accuracy on the evaluation metric.
![documents quality](/assets/img/project-quizup-logo-1.png)
To calculate Document Quality Index using Computer Vision
- Tools: Python, OpenCV, PyTorch, Computer Vision, Image Processing
- Calculate Document Quality Index using Computer Vision
- Transforming a document quality solution into an API.
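One common ingredient of a document quality measure is a sharpness score based on the variance of the Laplacian. The sketch below is an illustration only and is not the project's actual method: the function name `sharpness_score` and the nested-list image format are assumptions made so that the example runs without extra libraries. A real pipeline would more likely work on OpenCV image arrays.

```python
from statistics import pvariance


def sharpness_score(gray):
    """Variance of the 4-neighbour Laplacian over interior pixels.

    `gray` is a hypothetical input format: a list of rows, each a list of
    grayscale intensities (0-255). Higher scores suggest sharper edges.
    """
    height, width = len(gray), len(gray[0])
    if height < 3 or width < 3:
        raise ValueError("image must be at least 3x3 pixels")

    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Discrete Laplacian: sum of the four neighbours minus 4x the centre.
            laplacian = (
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
            responses.append(laplacian)
    return pvariance(responses)


# Illustrative inputs: a flat page has no edges, a checkerboard has many.
flat = [[128] * 5 for _ in range(5)]
checker = [[255 * ((x + y) % 2) for x in range(5)] for y in range(5)]
```

A blurry or blank scan yields a score near zero, so a threshold on this value could serve as one signal among several in a quality index.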
![E-billing Assistant](/assets/img/project-blog-logo.jpg)
An AIML-based E-billing Assistant Automation System
- Tools: Python, OpenCV, Keras, TensorFlow, PyTorch, BERT, Word Embeddings, Text Classification, AWS Textract
- An AIML model to flag potential bill rejections automatically.
- A system to quantify rejected items in e-bills and reduce rejection rates.
- An AIML model to detect fraud in legal firms' bill processing, minimizing human effort.
![remediation system](/assets/img/project-aim_bert-bias.png)
E-bill remediation system to quantify rejection rates.
- Tools: Python, OpenCV, Keras, TensorFlow, PyTorch, BERT, Word Embeddings, Multi-label Text Classification, AWS Textract
- An automated bill remediation system to quantify rejection rates and reduce rejection ratios.
- A solution to assess rejected items in e-bills, aiming to give the reason for each rejection.
- An AI-powered fraud detection system for legal bill processing, streamlining operations.
![Screenshot of web app](/assets/img/computer-vision-v2-04.png)
Document Classification Solution for Scanned PDF Files in Loan Applications
- Tools: Python, Document Classification, BERT, Random Forest, CNN, RNN, RCNN, LSTM, Pandas, NumPy, Image Processing, OpenCV, AWS Textract
- Designing and constructing a document classification system to automate the loan underwriting process.
- Clustering documents from comprehensive loan packages, comprising over 1500 pages and spanning 30+ categories concurrently.
- Tested the model on real scanned documents and achieved 90% overall accuracy on the evaluation metric.
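For reference, overall accuracy in a multi-class setting is the share of documents whose predicted category matches the true one. The sketch below shows the calculation on made-up labels. The category names and the helpers `overall_accuracy` and `per_category_recall` are hypothetical and are not taken from the project.

```python
from collections import Counter


def overall_accuracy(y_true, y_pred):
    """Fraction of documents whose predicted category equals the true one."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_category_recall(y_true, y_pred):
    """For each true category, the fraction of its documents predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {category: hits[category] / totals[category] for category in totals}


# Made-up labels for illustration only.
y_true = ["payslip", "payslip", "bank_statement", "tax_form", "tax_form"]
y_pred = ["payslip", "tax_form", "bank_statement", "tax_form", "tax_form"]
```

With many categories of uneven size, a single overall figure can hide a weak category, which is why a per-category breakdown is often reported alongside it.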
![Screenshot of web app](/assets/img/gan.jpg)
A Document Data Extractor to pull data from documents
- Tools: Python, Layout Detection, CNN, RNN, RCNN, LSTM, Pandas, NumPy, Image Processing, OpenCV, Computer Vision, Camelot, AWS Textract
- Extracting data from scanned PDF bank documents using Deep Learning
- Extracting data from scanned PDF bank documents, including tables and images, to expedite the loan application underwriting process.
Skills
Languages and Databases
![](/assets/img/python-logo-1-300x300.jpg)
![](/assets/img/mysql-logo-1-300x300.jpg)
![](/assets/img/embedded-logo.png)
Libraries
![](/assets/img/numpy-logo-1-500x500.jpg)
![](/assets/img/pandas-logo-2-500x500.jpg)
![](/assets/img/opencv-logo-1-500x500.jpg)
![](/assets/img/sk-learn-logo-1-500x500.jpg)
![](/assets/img/matplotlib-logo-1-500x500.jpg)
![](/assets/img/Plotly-logo.png)
Frameworks
![](/assets/img/flask-logo.png)
![](/assets/img/keras-logo.png)
![](/assets/img/tensorflow-logo-1.png)
![](/assets/img/pytorch-logo.png)
![](/assets/img/matlab-logo.jpg)
![](/assets/img/hf-logo.png)
Other
![](/assets/img/git.png)
![](/assets/img/aws.png)
![](/assets/img/heroku.png)
![](/assets/img/docker.png)
![](/assets/img/kubernetes-logo.png)
Education
Orissa, India
Degree: Master of Technology in Computer Science and Engineering
CGPA: 8.12/10.0
Relevant Coursework:
- Internet of Things
- Artificial Neural Networks Modeling
- Foundations of Algorithms
- Computer Vision
- Machine Learning
- Deep Learning
Dehradun, India
Degree: Bachelor of Technology in Electronics and Communication Engineering
CGPA: 6.2/10.0
Relevant Coursework:
- Digital Image Processing
- Embedded System and Robotics
- Digital Signal Processing
- C and Embedded C Programming
- MATLAB