
HTML page scraper

$30-100 USD

In Progress
Posted about 15 years ago

Paid on delivery
HTML files residing on a local drive need to be scraped for data, and that data placed into either a MySQL or SQLite table based upon a definition table.

## Deliverables

I need a Delphi 7 application that will scrape data from HTML files that reside on a local hard drive and place the data into either a MySQL table or a SQLite table.

The application will have a string constant that points to the location of the HTML files, e.g.

const SourcePath : String = 'c:\data\';

The application needs to search that location for *.htm files, including any subfolders that might exist. If there are any subfolders under c:\data, they need to be searched for *.htm files as well.

The data to be scraped will be defined by a table with 3 fields:

BeginTag:
EndTag:
DBField:

Each definition/record in this table needs to be applied to each HTM file found in SourcePath. Here's an example of this definition table:

BeginTag: **Date:**
EndTag: *
DBField: THEDATE

BeginTag: **ID:**
EndTag: *
DBField: IDNUMBER

BeginTag: Rank: #
EndTag: in
DBField: RANKING

Note: The actual definition table will hold more than just 3 definitions. The app needs to be able to handle all of the definition entries/records it finds.

So, here's how the definition table would work. Using the 3 definitions above as an example, we would start with the "**Date:**" BeginTag. The app would search the HTML code in the first file for the first instance of "**Date:**". Starting at the character position immediately after this BeginTag, it would store the characters it finds in a temporary string until it reaches the EndTag, which in this case would be " * ". Whatever data has been collected between the BeginTag and the EndTag will be written to a different table (we'll call it the RESULTS DATA table) AFTER all of the definitions have been iterated through.
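The BeginTag/EndTag rule described above can be sketched in a few lines. This is only an illustration of the logic, written in Python rather than the Delphi 7 the project calls for. The function names, the tag strings, and the sample HTML are all hypothetical, and two behaviors are assumptions the brief does not specify: a missing tag yields no value, and surrounding whitespace is trimmed.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical definitions as (BeginTag, EndTag, DBField). The tag strings are
# invented for this sketch; real ones would come from the definition table.
DEFINITIONS: List[Tuple[str, str, str]] = [
    ("<b>Date:</b>", "<br>", "THEDATE"),
    ("<b>ID:</b>", "<br>", "IDNUMBER"),
    ("Rank: #", " in", "RANKING"),
]


def extract_between(html: str, begin_tag: str, end_tag: str) -> Optional[str]:
    """Return the text between the first begin_tag and the next end_tag."""
    start = html.find(begin_tag)
    if start == -1:
        return None  # assumption: a missing BeginTag means "no value"
    start += len(begin_tag)  # first character after the BeginTag
    end = html.find(end_tag, start)
    if end == -1:
        return None  # assumption: a missing EndTag also means "no value"
    return html[start:end].strip()  # assumption: trim surrounding whitespace


def scrape(html: str, definitions: List[Tuple[str, str, str]]) -> Dict[str, Optional[str]]:
    """Apply every definition to one file's HTML and collect one record."""
    return {field: extract_between(html, begin, end)
            for begin, end, field in definitions}


# Invented sample input, shaped to match the hypothetical definitions above.
SAMPLE_HTML = ("<p><b>Date:</b> July 8, 2008<br>"
               "<b>ID:</b> A9023<br>Rank: #23 in Widgets</p>")

record = scrape(SAMPLE_HTML, DEFINITIONS)
print(record)
# {'THEDATE': 'July 8, 2008', 'IDNUMBER': 'A9023', 'RANKING': '23'}
```

Collecting all fields into one record before touching the database matches the requirement that nothing is written until every definition has been applied.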
So, the app would move on to the next definition record (BeginTag: **ID:** EndTag: * ) and likewise scrape the data to a temporary string. And then move on to the next definition, etc. Once all the definitions have been iterated through, the scraped data will be written to a record in the RESULTS DATA table. In the example above, 3 strings of data would be written to the fields THEDATE, IDNUMBER and RANKING.

The app would then move on to the next HTM file it finds, repeat the scraping of data based upon the definitions, and save the scraped data to another record in the RESULTS DATA table. And so on.

Before writing the scraped data to a record in the RESULTS DATA table, the app will need to check whether a matching record already exists there. We don't want duplicate records! The app only needs to check a single field to determine whether a record already exists in the RESULTS DATA table. That single field will be defined by a string constant, INDEXFIELD, e.g.:

const IndexField : String = 'IDNUMBER';

If a record already exists, then the record will be replaced. If a record does not exist, a new one will be added to the RESULTS DATA table.

Before moving on to the next HTM file, the app will rename the original HTM file by appending the extension ".processed" to its file name.

A progress bar will be required, showing the current status of completion based upon how many HTM files still need to be processed.

A TMemo will be placed on the main form, which will be used for output/logging/debugging purposes. Each processed HTM file will have the following logged into the TMemo:

1) The full path to the file name
2) The scraped data found within that file

e.g.

c:\data\[login to view URL]
THEDATE: July 8, 2008
IDNUMBER: A9023
RANKING: 23

c:\data\[login to view URL]
THEDATE: June 18, 2000
IDNUMBER: B1234
RANKING: 567

Before the program exits/quits, the TMemo needs to be written to disk, using the following file format: [login to view URL] in a \LOGS folder (placed under the application folder).

1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.

2) Deliverables must be in ready-to-run condition, as follows (depending on the nature of the deliverables):

a) For web sites or other server-side deliverables intended to only ever exist in one place in the Buyer's environment: Deliverables must be installed by the Seller in ready-to-run condition in the Buyer's environment.

b) For all others including desktop software or software the buyer intends to distribute: A software installation package that will install the software in ready-to-run condition on the platform(s) specified in this bid request.

3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site per the coder's Seller Legal Agreement).

## Platform

Windows 32-bit
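The duplicate check in the deliverables (replace the record if one with the same INDEXFIELD value exists, otherwise add a new one) could be sketched as follows. Again this is Python for illustration only, not the requested Delphi 7 code. The table name `results`, the `upsert` helper, the text column types, and the second A9023 row are assumptions made for the example, and it targets SQLite only.

```python
import sqlite3
from typing import Dict

INDEX_FIELD = "IDNUMBER"  # mirrors the IndexField constant in the brief


def upsert(conn: sqlite3.Connection, table: str, record: Dict[str, str],
           index_field: str = INDEX_FIELD) -> None:
    """Replace the row whose index_field matches the record, or add a new row.

    Table and column names are interpolated into the SQL text, so they must
    come from trusted constants or the definition table, never from scraped
    data. Scraped values are always passed as bound parameters.
    """
    columns = list(record)
    column_list = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    with conn:  # delete + insert commit together, or roll back together
        conn.execute(
            f'DELETE FROM "{table}" WHERE "{index_field}" = ?',
            (record[index_field],),
        )
        conn.execute(
            f'INSERT INTO "{table}" ({column_list}) VALUES ({placeholders})',
            [record[c] for c in columns],
        )


# In-memory database with a hypothetical RESULTS DATA table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (THEDATE TEXT, IDNUMBER TEXT, RANKING TEXT)")

upsert(conn, "results",
       {"THEDATE": "July 8, 2008", "IDNUMBER": "A9023", "RANKING": "23"})
# Invented second scrape of the same ID: it should replace the first row.
upsert(conn, "results",
       {"THEDATE": "July 9, 2008", "IDNUMBER": "A9023", "RANKING": "19"})
upsert(conn, "results",
       {"THEDATE": "June 18, 2000", "IDNUMBER": "B1234", "RANKING": "567"})

rows = conn.execute(
    "SELECT IDNUMBER, THEDATE, RANKING FROM results ORDER BY IDNUMBER"
).fetchall()
print(rows)
# [('A9023', 'July 9, 2008', '19'), ('B1234', 'June 18, 2000', '567')]
```

A delete followed by an insert works without a unique index on the index field. If the RESULTS DATA table did declare that field unique, SQLite's `INSERT OR REPLACE` would be a shorter alternative.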
Project ID: 2806574

About the project

13 proposals
Remote project
Active 15 years ago

Awarded to:
See private message.
$25.50 USD in 10 days
5.0 (71 reviews)
6.2

13 freelancers are bidding an average of $64 USD for this job
See private message.
$102 USD in 10 days
5.0 (29 reviews)
5.6

See private message.
$59.50 USD in 10 days
5.0 (63 reviews)
5.1

See private message.
$50.15 USD in 10 days
5.0 (46 reviews)
5.0

See private message.
$51 USD in 10 days
5.0 (8 reviews)
3.6

See private message.
$85 USD in 10 days
5.0 (24 reviews)
3.6

See private message.
$29.75 USD in 10 days
4.9 (19 reviews)
3.5

See private message.
$51 USD in 10 days
5.0 (4 reviews)
2.4

See private message.
$21.25 USD in 10 days
5.0 (2 reviews)
1.3

See private message.
$212.50 USD in 10 days
0.0 (0 reviews)
0.0

See private message.
$80.75 USD in 10 days
0.0 (0 reviews)
0.0

See private message.
$21.25 USD in 10 days
0.0 (0 reviews)
0.0

See private message.
$42.50 USD in 10 days
0.0 (1 review)
0.0

About the client

Fredericksburg, United States
4.9
29
Payment method verified
Member since Mar 7, 2009

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)