The spam mail classifier is a machine learning based project in which spam mails are detected and the user is protected from unauthorised mails.
In recent times, unwanted commercial bulk emails, known as spam, have become a big problem on the internet. The person sending the spam messages is referred to as the spammer. Such a person gathers email addresses from different websites, chatrooms, and viruses. Spam prevents the user from making full and good use of time, storage capacity and network bandwidth. The huge volume of spam mails flowing through computer networks has damaging effects on the memory space of email servers, communication bandwidth, processor power and user time. The menace of spam email is on the rise on a yearly basis and is responsible for over 77% of the total global email traffic.
Keywords
Computer science, system security, system privacy
Analysis of algorithms, Machine learning, Spam filtering, Deep learning
Neural networks, Support vector machines, Naïve Bayes.
Outline
EDA (Exploratory data analysis)
Data Pre-processing
Feature Extraction
Scoring & Metrics
Improvement by using Embedding + Neural Network
Comparison of ML algorithms & Deep Learning
Libraries that we are going to need for this program:
import numpy as np
import pandas as pd
import nltk
from nltk.corpus import stopwords
import string
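As a rough illustration of how `string` and a stop-word list are commonly combined in the pre-processing step listed in the outline, here is a minimal sketch. The function name `clean_text` and the small `DEMO_STOPWORDS` set are assumptions made for this example only; they are not taken from the project itself. In practice the NLTK English stop-word list imported above would likely replace the demo set.

```python
import string

# Hypothetical, tiny stop-word set so that the sketch runs without downloads.
# An assumption for illustration: a real run would more likely use
# set(stopwords.words("english")) after nltk.download("stopwords").
DEMO_STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and",
                  "you", "your", "for", "in"}


def clean_text(message, stop_words=DEMO_STOPWORDS):
    """Lowercase a message, strip punctuation, and drop stop words."""
    # Remove every character listed in string.punctuation.
    no_punct = message.translate(str.maketrans("", "", string.punctuation))
    # Split on whitespace and keep only tokens that are not stop words.
    return [tok for tok in no_punct.lower().split() if tok not in stop_words]


tokens = clean_text("Congratulations! You are the WINNER of a free prize.")
print(tokens)
```

The function returns a list of tokens rather than a joined string, so that a later feature extraction step can count words directly.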
In an e-mail filtering task, some features could be the bag of words or the subject
line analysis. Thus, the input to the e-mail classification task can be viewed as a two-dimensional
matrix, whose axes are the messages and the features. E-mail classification tasks are typically
divided into several sub-tasks. First, data collection and representation are mostly
problem-specific (i.e. e-mail messages); second, e-mail feature selection and feature reduction attempt to
reduce the dimensionality (i.e. the number of features) for the remaining steps of the task.
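
The messages-by-features matrix described above can be sketched in a few lines. This is a minimal illustration, not the project's actual feature extraction: the function name `bag_of_words`, the `min_count` parameter, and the three sample messages are assumptions made for the example. Dropping rare words stands in for feature reduction in its simplest form.

```python
import numpy as np


def bag_of_words(messages, min_count=1):
    """Return (vocab, X) where X[i, j] counts word vocab[j] in message i."""
    tokenized = [m.lower().split() for m in messages]
    # Count how many times each word occurs over the whole corpus.
    totals = {}
    for toks in tokenized:
        for t in toks:
            totals[t] = totals.get(t, 0) + 1
    # Crude feature reduction: keep words seen at least min_count times.
    vocab = sorted(w for w, c in totals.items() if c >= min_count)
    index = {w: j for j, w in enumerate(vocab)}
    # Rows are messages, columns are features (words).
    X = np.zeros((len(messages), len(vocab)), dtype=int)
    for i, toks in enumerate(tokenized):
        for t in toks:
            if t in index:
                X[i, index[t]] += 1
    return vocab, X


messages = ["win a free prize now", "meeting at noon", "free prize inside"]
vocab, X = bag_of_words(messages)
vocab_small, X_small = bag_of_words(messages, min_count=2)
print(X.shape)        # one row per message, one column per distinct word
print(vocab_small)    # only the words that occur at least twice
print(X_small)
```

With `min_count=2` the matrix shrinks to the columns for words shared by more than one occurrence, which shows how reducing the number of features changes the second axis while the number of messages stays fixed.
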
Finally, the e-mail classification phase of the process finds the actual mapping between training