Perl short-text classifier to guess a person's ethnicity from their name

Completed. Posted 6 years ago. Paid on delivery.

Your job is to create a simple, short string classifier in Perl. The input to the system is a person's name, i.e. a UTF-8 encoded string usually 10 to 40 characters long, and the system determines which of the predefined classes the string belongs to. The classes are ethnicity groups.

For example, if the input to the system is "John Smith", the system would output the class "English"; if the input is "hiromi akiyama", the system would output "Japanese". There are 18 different classes (ethnicity groups).

The system has two parts: 1) A training script called [url removed, login to view], which trains the system using the given training data (a list of known 'name = ethnicity' pairs) and saves the "trained state" of the system to disk. The script is invoked as "perl [url removed, login to view] [url removed, login to view]".

2) An analyzer script, [url removed, login to view], which loads the "trained state" from disk (generated earlier by the training script) and uses the loaded data to determine which class a given string belongs to. The script is invoked either as "perl [url removed, login to view] [url removed, login to view]", in which case it loads the given test file, or as "perl [url removed, login to view] "john smith"", in which case it simply classifies the string given on the command line ("john smith" in this case).

Attached is data.zip. It contains [url removed, login to view] and testing_data.txt. The data is in the format "name:class", where the name is Base64-encoded.
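As an illustration of the "name:class" format described above, here is a minimal sketch of reading one record using only core Perl modules. The function name parse_record is hypothetical, and the sketch assumes the standard Base64 alphabet (which never contains a colon, so the first colon on the line is the separator) and that the decoded bytes are UTF-8.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);
use Encode qw(decode);

# Hypothetical helper: parse one "name:class" record.
# Returns (name, class), or an empty list if the line has no usable separator.
sub parse_record {
    my ($line) = @_;
    $line =~ tr/\r\n//d;                      # drop line endings (tr, not a regex)
    my $sep = index($line, ':');              # Base64 text cannot contain ':'
    return if $sep < 1;
    my $encoded = substr($line, 0, $sep);
    my $class   = substr($line, $sep + 1);
    # Assumption: the decoded bytes are UTF-8, as the posting says names are.
    my $name = decode('UTF-8', decode_base64($encoded));
    return ($name, $class);
}

my ($name, $class) = parse_record("Sm9obiBTbWl0aA==:English\n");
print "$name => $class\n";
```

Decoding to a character string (rather than leaving raw bytes) matters later if the classifier lowercases names or slices them into character n-grams.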

Your system must be trainable with the given [url removed, login to view] so that it classifies [url removed, login to view] with 90% or better accuracy.
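The accuracy requirement can be checked with a few lines of Perl. The sketch below assumes accuracy means the fraction of test names whose predicted class equals the labeled class; the function name accuracy and the stand-in classifier are hypothetical and only there to make the example runnable.

```perl
use strict;
use warnings;

# Hypothetical helper: fraction of [name, class] pairs for which
# $classify->(name) returns the labeled class.
sub accuracy {
    my ($classify, $pairs) = @_;
    return 0 unless @$pairs;
    my $correct = grep { $classify->($_->[0]) eq $_->[1] } @$pairs;
    return $correct / @$pairs;
}

# Stand-in classifier for illustration only: always answers "English".
my $always_english = sub { return 'English' };
my @test_pairs = (
    [ 'John Smith',     'English'  ],
    [ 'Hiromi Akiyama', 'Japanese' ],
);
my $acc = accuracy($always_english, \@test_pairs);
printf "accuracy: %.1f%% (%s)\n", 100 * $acc, $acc >= 0.9 ? 'pass' : 'fail';
```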

Notice: The solution must be training-based, for example a Bayesian classifier, an n-gram analyzer, or some other form of artificial intelligence or machine learning. The solution must not be based on regular expressions or on a fixed (human-written) set of detection rules.
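To make the "training-based" requirement concrete, here is one possible shape such a solution could take: a naive Bayes classifier over character trigrams with add-one smoothing, in plain Perl. This is a sketch under stated assumptions, not the awarded solution; the function names (ngrams, train, classify), the trigram size, and the '^' / '$' padding are all illustrative choices, and nothing here is claimed to reach the 90% target.

```perl
use strict;
use warnings;

# Character n-grams of a lowercased name, padded so name edges become features.
sub ngrams {
    my ($name, $n) = @_;
    $n //= 3;                                  # assumption: trigrams
    my $padded = '^' . lc($name) . '$';
    return map { substr($padded, $_, $n) } 0 .. length($padded) - $n;
}

# Build per-class counts from a list of [name, class] pairs.
sub train {
    my ($pairs) = @_;
    my %model = (docs => {}, grams => {}, total => {}, vocab => {});
    for my $pair (@$pairs) {
        my ($name, $class) = @$pair;
        $model{docs}{$class}++;                # names seen per class (prior)
        for my $g (ngrams($name)) {
            $model{grams}{$class}{$g}++;       # n-gram count within the class
            $model{total}{$class}++;           # all n-grams within the class
            $model{vocab}{$g} = 1;             # global n-gram vocabulary
        }
    }
    return \%model;
}

# Return the class with the highest log-posterior under add-one smoothing.
sub classify {
    my ($model, $name) = @_;
    my $n_docs = 0;
    $n_docs += $_ for values %{ $model->{docs} };
    my $vocab_size = keys(%{ $model->{vocab} }) || 1;
    my ($best, $best_score);
    for my $class (sort keys %{ $model->{docs} }) {
        my $score = log($model->{docs}{$class} / $n_docs);
        my $total = $model->{total}{$class} // 0;
        for my $g (ngrams($name)) {
            my $count = $model->{grams}{$class}{$g} // 0;
            $score += log(($count + 1) / ($total + $vocab_size));
        }
        ($best, $best_score) = ($class, $score)
            if !defined $best_score || $score > $best_score;
    }
    return $best;
}

# Toy training set for illustration only; real training uses the attached data.
my $model = train([
    [ 'John Smith',     'English'  ],
    [ 'James Brown',    'English'  ],
    [ 'Hiromi Akiyama', 'Japanese' ],
    [ 'Yuki Tanaka',    'Japanese' ],
]);
print classify($model, 'hiromi tanaka'), "\n";
```

Because the model is a plain hash reference, the "trained state" could be written to disk and read back with the core Storable module (store and retrieve), which would fit the two-script split the posting asks for.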

You are free to use any existing free Perl code, libraries and modules, such as AI or data classifier libraries.

Perl

Project ID: #16538057

About the project

5 proposals. Remote project. Active 6 years ago.

Awarded to:

kchwistek

Hi, I am quite an experienced programmer and know several programming languages. Your project is interesting. I studied Computer Science in the past, and AI is a topic I like to think about. Unfortunately t…

$165 USD in 10 days
(2 reviews)
3.5

5 freelancers are bidding an average of $156 for this job

freelance4hire80

Hi, I've checked the project spec. I can come up with a Perl script for you, for example using Algorithm::NaiveBayes, to predict a person's ethnicity from the training data sets.

$155 USD in 3 days
(49 reviews)
6.6
balu0priya1

I would like to work on this project, as I have enough experience in Perl. If interested, please let me know.

$155 USD in 3 days
(0 reviews)
0.0