Resume Building and Screening
Using ML
Name: Soumya S Gidaganti
Department of Master of Computer Applications
Jain College of Engineering (JCE), Belagavi
2024-2025
Abstract
In today's fast-moving world, tasks are expected to be done faster, smarter, and
more accurately. Most organizations conduct their recruitment face-to-face at an
arranged venue, but during a pandemic such as Covid-19, face-to-face recruitment
becomes very difficult. The proposed system offers a smarter way to carry out the
recruitment process from anywhere in the world, based on the company's
requirements.
The aim of this project is to make candidate recruitment easier for companies. It
reduces the manual work that goes into recruiting by automating the initial
screening of candidates. Eliminating unsuitable candidates at this stage retains
only the applicable ones. This is achieved through resume scanning, initial aptitude
testing of candidates, and an interview session in which the candidate answers
questions asked by the interviewer. With this model, the time and manual labour
otherwise spent eliminating unsuitable candidates is saved. The model selects the
candidate best suited to a job by comparing each received resume with the job
description. Our model works accurately for some of the predefined parameters of a
company's recruitment process while providing more security and reliability.
Introduction
Purpose: A typical job posting on the Internet receives a massive number of
applications within a short window of time. Filtering these resumes manually is not
practical, as it takes a lot of time and incurs costs that the hiring
companies cannot afford to bear. In addition, manual screening is not
fair, as many suitable profiles do not get the consideration they deserve. This
may result in missing out on the right candidates or selection of unsuitable applicants
for the job. In this paper, we describe a solution that aims to solve these issues by
automatically suggesting the most appropriate candidates according to the given job
description.
Our system uses Natural Language Processing to extract relevant information like
skills, education, experience, etc. from the unstructured resumes and hence creates
a summarised form of each application. With all the irrelevant information removed,
the task of screening is simplified and recruiters can better analyze each resume in
less time. After this text-mining process is completed, the proposed solution employs
a vectorization model and uses cosine similarity to match each resume with the job
description. The calculated ranking scores can then be utilized to determine the best-
fitting candidates for that particular job opening.
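The matching step described above can be illustrated with a minimal sketch. It
assumes TF-IDF as the vectorization model, which the text does not specify, and it
uses only the Python standard library so that it runs without dependencies; with
Sklearn, TfidfVectorizer and cosine_similarity would be the usual equivalents. The
function names and the sample resumes are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (keeps '+' and '#')."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def tfidf_vectors(documents):
    """Build one sparse TF-IDF vector (dict of term -> weight) per document."""
    token_lists = [tokenize(doc) for doc in documents]
    n_docs = len(token_lists)
    # Document frequency: the number of documents in which each term occurs.
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        # Smoothed IDF keeps every weight positive, even for very common terms.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_resumes(job_description, resumes):
    """Return (name, score) pairs sorted from best to worst match."""
    names = list(resumes)
    vectors = tfidf_vectors([job_description] + [resumes[n] for n in names])
    job_vector = vectors[0]
    scores = [
        (name, cosine_similarity(job_vector, vector))
        for name, vector in zip(names, vectors[1:])
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data, for illustration only.
job = "Python developer with machine learning and SQL experience"
resumes = {
    "alice": "Experienced Python developer, machine learning projects, SQL databases",
    "bob": "Graphic designer skilled in Photoshop and illustration",
}
ranking = rank_resumes(job, resumes)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because cosine similarity depends only on the angle between the two vectors, a long
resume is not favoured over a short one simply for containing more words.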
Project Scope:
• Separating the right candidates from the pack
• Making sense of candidate CVs
• Filtering the candidates based on their performance.
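As an illustration of "making sense of candidate CVs", the sketch below turns an
unstructured resume string into a small structured summary using regular
expressions. It is a simplified stand-in for the NLP extraction step, not the
project's actual implementation: the skill vocabulary, the patterns, and the sample
text are all assumptions made for the example.

```python
import re

# Hypothetical skill vocabulary; in practice this would come from the
# company's requirements or the job description.
KNOWN_SKILLS = ["python", "java", "sql", "machine learning", "excel"]

# Simple illustrative patterns; real resumes need more robust handling.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
YEARS_RE = re.compile(r"(\d+)\+?\s*years?", re.IGNORECASE)


def summarise_resume(text):
    """Return a structured summary (email, skills, experience) of a resume."""
    lowered = text.lower()
    # Match each skill as a whole word or phrase so "java" does not
    # match inside "javascript".
    skills = [
        skill for skill in KNOWN_SKILLS
        if re.search(r"\b" + re.escape(skill) + r"\b", lowered)
    ]
    email = EMAIL_RE.search(text)
    years = [int(value) for value in YEARS_RE.findall(text)]
    return {
        "email": email.group(0) if email else None,
        "skills": skills,
        "max_years_experience": max(years) if years else 0,
    }


# Hypothetical sample resume text.
sample = (
    "Jane Doe, jane.doe@example.com. 3 years of experience in Python and SQL; "
    "built machine learning models."
)
summary = summarise_resume(sample)
print(summary)
```

A summary of this kind can then be used to filter candidates, for example by
requiring a minimum number of matched skills.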
Product Features:
• You no longer need to copy text by hand from hundreds of PDF files. The tool
batch-processes all of the PDF data in a single run.
• Create one single Excel, CSV, or XML file from all PDF files.
• It is a single standalone program which is free.
• Make any change to text or images in a PDF without losing formatting.
• User-friendly, with simple steps that are easy to follow.
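The "one single CSV file" feature can be sketched as follows. This is a minimal
example that assumes the text of each PDF has already been extracted and
summarised into a dictionary; PDF parsing itself needs a third-party library that
the text does not name, so it is left out. The field names and file names are
hypothetical.

```python
import csv
import io


def records_to_csv(records, fieldnames):
    """Serialise a list of per-resume dicts into one CSV document (a string)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()


# Hypothetical per-file summaries, one dict per processed PDF.
records = [
    {"file": "alice.pdf", "skills": "python; sql", "years": 3},
    {"file": "bob.pdf", "skills": "photoshop", "years": 5},
]
csv_text = records_to_csv(records, ["file", "skills", "years"])
print(csv_text)
```

To write the result to disk instead, the same DictWriter can be pointed at a file
opened with open(path, "w", newline="").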
System Analysis
After an extensive analysis of the problems in the system, we identified the
requirements that the system needs. These requirements are categorized into
software requirements and hardware requirements, and are listed below:
Software Requirements:
Operating System: Windows 8/8.1/10
IDE: The Jupyter Notebook
• The notebook extends the console-based approach to interactive computing in a
qualitatively new direction, providing a web-based application suitable for capturing
the whole computation process: developing, documenting, and executing code, as
well as communicating the results.
• Notebook documents contain the inputs and outputs of an interactive session as
well as additional text that accompanies the code but is not meant for execution. In
this way, notebook files can serve as a complete computational record of a session,
interleaving executable code with explanatory text, mathematics, and rich
representations of resulting objects.
• JupyterLab also offers a unified model for viewing and handling data formats.
JupyterLab understands many file formats (images, CSV, JSON, Markdown, PDF, Vega,
Vega-Lite, etc.) and can also display rich kernel output in these formats.
• JupyterLab extensions can customize or enhance any part of JupyterLab, including
new themes, file editors, and custom components.
• JupyterLab is served from the same server and uses the same notebook document
format as the classic Jupyter Notebook.
Language Used: Python
• Python is an interpreted high-level general-purpose programming language.
Python's design philosophy emphasizes code readability with its notable use of
significant indentation. Its language constructs as well as its object-oriented approach
aim to help programmers write clear, logical code for small and large-scale projects.
• Python is dynamically typed and garbage-collected. It supports multiple
programming paradigms, including structured (particularly, procedural),
object-oriented and functional programming. Python is often described as a "batteries
included" language due to its comprehensive standard library.
Libraries Used: pandas, NumPy, Matplotlib, scikit-learn (sklearn), seaborn, re, NLTK, wordcloud
Hardware Requirements:
Processor: Intel i3 or above
Clock Speed: 2 GHz
System Bus: 64-bit
RAM: 4 GB or more
Monitor: LCD
Keyboard: QWERTY (101 keys)