
Simple Data Processing With JavaScript And The HERE API


Have you ever needed to work with comma-separated value (CSV) data that wasn’t formatted well, or to work out complete address information from very little input? While unrelated, these two problems come up quite a bit, most often when I’m dealing with contact or lead data that I collect from conferences and other events.

The great thing is that we live in a time where plenty of development libraries and services exist to make this process of data parsing and manipulation easy to accomplish in an automated fashion.

We’re going to see how to take a CSV file representing partially complete people data and convert it to JSON. Then we’re going to fill in the gaps when it comes to the geolocation side of things, using the HERE Geocoder API.

Determining a Valid Scenario for Geocoding and Data Parsing

Let’s paint a more appropriate picture of what we hope to accomplish. Let’s imagine that you’ve collected leads from a conference and at the end of the conference you got those leads in a CSV file. That CSV file might contain data that looks like the following:

firstname,lastname,company,email,phone_number,location
John,Doe,Fictional Company,john@example.com,510-123-4567,"Napa, USA"
Nic,Raboy,The Polyglot Developer,test@example.com,209-123-4567,"Tracy, USA"
Jane,Doe,Made Up Company,jane@example.com,559-123-4567,"California, USA"

The above data is great and all, but it leaves much to be desired. If we continue down the path of conference leads for an organization that sponsored an event, we might be asking the following questions:

  1. What format of data does my CRM like Salesforce expect when it comes to lead data?
  2. Which sales representative should get which lead when the location data is incomplete?
  3. Isn’t it going to be a pain to manually figure out the state and postal code information for leads that are missing it?

Our goal now is going to be to work with this CSV data and make the location data complete enough to use.

Developing a Node.js Application for Data Processing and Location Parsing

To be successful at this, we’re going to create a new project and work our way up. Somewhere on your computer, create a new directory and navigate into it with the Terminal and execute the following:

npm init -y

The above command will create a new package.json file which will be our project’s blueprint. The next step is to download the project dependencies. From the command line execute the following:

npm install csvtojson --save
npm install request --save
npm install request-promise --save

We’re going to be using the csvtojson package to convert our CSV data to JSON and the request-promise package to make HTTP requests against the HERE API. The request-promise package requires the vanilla request package as a peer dependency, which is why we install both.

For code cleanliness, we’re going to separate our project into three files. From the command line, execute the following:

touch app.js
touch here.js
touch config.json

If you don’t have the touch command, create those files manually. All of our HERE API logic will go in the here.js file and all of our parsing logic will go in the app.js file. The config.json file will hold our keys and other hard-coded information.

Before we can start development, we need to have the appropriate HERE API keys in hand. If you haven’t already, create a HERE account and head to your project’s dashboard.

HERE Development Project Dashboard

You’re going to need the App ID and App Code from this dashboard.

Now that we have what we need, we can start development with JavaScript and Node.js. Open the project’s here.js file and include the following so we can get the RESTful API working:

const Request = require("request-promise");
const config = require("./config.json");

module.exports = class Here {

    constructor() { }

    getAddressInformation(query) {
        var addresses = [];
        return Request({
            uri: "https://geocoder.cit.api.here.com/6.2/geocode.json",
            qs: {
                "app_id": config.app_id,
                "app_code": config.app_code,
                "searchtext": query
            },
            json: true
        }).then(result => {
            if(result.Response.View.length > 0) {
                for(var i = 0; i < result.Response.View[0].Result.length; i++) {
                    addresses.push({
                        country: result.Response.View[0].Result[i].Location.Address.Country,
                        state: result.Response.View[0].Result[i].Location.Address.State,
                        city: result.Response.View[0].Result[i].Location.Address.City,
                        postal_code: result.Response.View[0].Result[i].Location.Address.PostalCode,
                        county: result.Response.View[0].Result[i].Location.Address.County,
                    });
                }
            }
            return addresses;
        });
    }

}

Notice that we are creating a class with a function for making HTTP requests. The HTTP request is to the HERE API and we’re passing in the required parameters as query parameters. When we receive results, we are first checking to make sure location data exists and if it does, push it into an array. Depending on the search query, you may end up with no location results or many location results.
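To make the response handling easier to follow, here is a minimal sketch of the same mapping as a standalone function that runs without a network call. The sampleResponse object is a hypothetical, trimmed-down stand-in that only mirrors the fields the code above reads, and extractAddresses is a name made up for this illustration.

```javascript
// Hypothetical, trimmed-down response with only the fields read in here.js.
const sampleResponse = {
    Response: {
        View: [{
            Result: [{
                Location: {
                    Address: { Country: "USA", State: "CA", County: "Napa", City: "Napa", PostalCode: "94559" }
                }
            }]
        }]
    }
};

// Flatten each geocoder result into the plain object used in this tutorial.
function extractAddresses(result) {
    const views = (result && result.Response && result.Response.View) || [];
    if (views.length === 0) {
        return []; // the query matched nothing
    }
    return views[0].Result.map(item => {
        const address = item.Location.Address;
        return {
            country: address.Country,
            state: address.State,
            city: address.City,
            postal_code: address.PostalCode,
            county: address.County
        };
    });
}

console.log(extractAddresses(sampleResponse));
console.log(extractAddresses({ Response: { View: [] } })); // []
```

Keeping the mapping in a pure function like this makes it possible to test the parsing logic without spending API requests.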

Since we are using the config.json file within our API, we should probably fill it. Open the config.json file and include the following:

{
    "app_id": "YOUR_APP_ID_HERE",
    "app_code": "YOUR_APP_CODE_HERE",
    "area_codes": {
        "usa": "./usa-area-codes-with-abbr.json"
    }
}

You’re probably wondering what the area code data is, but we’re going to hold off on that for now. Just make sure it exists in your configuration and make sure you replace the placeholder values with your actual App ID and App Code.

Now we can focus on the driving logic of the application. Open the project’s app.js file and include the following:

const CSVToJSON = require("csvtojson");

const config = require("./config.json");
const Here = require("./here.js");
const AreaCodes = require(config.area_codes.usa);

var here = new Here();

CSVToJSON().fromFile("./data.csv").then(leads => {
    var leadInformation = [];
    for(var i = 0; i < leads.length; i++) {
        leadInformation.push(here.getAddressInformation(leads[i].location));
    }
    Promise.all(leadInformation).then(result => {
        // ...
    });
}, error => {
    throw error;
});

In the above code, we’ve included our config.json file as well as our here.js file. We’ve also loaded the potential area codes file that we referenced in the config.json file.

The first step is to load our CSV file and convert it to JSON. Once we have an array of objects, we can loop through them and pass the location property, formerly the location column, into our getAddressInformation function. Remember, the RESTful API is asynchronous, so we need to make sure all requests complete before working with the results, hence the Promise.all call.

We know our data sample, so if I were to pass “Napa, USA” into the function, the promise would resolve to an array containing a single address like the following:

{
    country: 'USA',
    state: 'CA',
    city: 'Napa',
    postal_code: '94559',
    county: 'Napa'
}

The above data is definitely more complete and usable for the sales team that receives it. The next step is to put it back into our leads. Let’s look at that Promise.all call:

Promise.all(leadInformation).then(result => {
    for(var i = 0; i < leads.length; i++) {
        leads[i].location = result[i][0];
    }
    console.log(leads);
});

Promise.all resolves with its results in the same order as the promises we passed in, so the result at index i belongs to the lead at index i. In the above code we are taking the first address from the HERE API response and using it to replace the old, potentially crazy, location query.
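The ordering behavior is easy to check in isolation. The sketch below uses a hypothetical fakeLookup function with artificial delays in place of the real HERE request, so the first lookup is deliberately the last one to finish.

```javascript
// Hypothetical stand-in for here.getAddressInformation(): resolves after `delay` ms.
function fakeLookup(query, delay) {
    return new Promise(resolve => {
        setTimeout(() => resolve([{ city: query }]), delay);
    });
}

// The first request is the slowest, so it finishes last.
const pending = [
    fakeLookup("Napa", 30),
    fakeLookup("Tracy", 10),
    fakeLookup("California", 20)
];

// Promise.all resolves with results in the order of the input array,
// not in the order in which the promises settled.
const ordered = Promise.all(pending).then(results => results.map(r => r[0].city));

ordered.then(cities => console.log(cities)); // [ 'Napa', 'Tracy', 'California' ]
```

Note that Promise.all rejects as soon as any single request fails, so one bad lookup would discard the whole batch unless each request handles its own errors.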

However, what if we were using the “Tracy, USA” example? If we pass “Tracy, USA” into the HERE API, we’re going to get the following:

[
    { "city": "Tracy", "state": "CA", "postal_code": "95376", "county": "San Joaquin", "country": "USA" },
    { "city": "Tracy", "state": "MO", "postal_code": "64079", "county": "Platte", "country": "USA" },
    { "city": "Tracy", "state": "IA", "postal_code": "50256", "county": "Marion", "country": "USA" }
]

Only one match exists for “Napa, USA”, but several exist for “Tracy, USA”. Which one do we use?

This is where your imagination has to come in. There is no single right way to do this, only several reasonable ones. We could take the company information and use that address, but the person might be in a satellite location or work on a remote team. Instead, I think it would be valuable to use the phone number information.

Remember that area codes file I told you to ignore for now? We’re going to take a look at it. In the United States, a geographic area code belongs to a single state. If we know the area code, we know the state where the number was issued, and that may tell us which of the locations from the HERE API is the right one. Keep in mind that people often keep their mobile numbers when they move, so this is a hint rather than a guarantee. This data does exist online, but I’ve made my own that you can download here. To give credit where credit is due, I obtained the area code data from GitHub and made some of my own manipulations to get it in the format that I wanted.
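As a rough illustration of the lookup, the sketch below maps a phone number to a state abbreviation. The areaCodes object is a hypothetical five-entry excerpt; the real file is assumed to map each area code to an object with an abbr property. The stateForPhone helper and its digit normalization are also assumptions added for this example.

```javascript
// Hypothetical excerpt; the real file is assumed to map area code -> { abbr }.
const areaCodes = {
    "209": { abbr: "CA" },
    "510": { abbr: "CA" },
    "559": { abbr: "CA" },
    "641": { abbr: "IA" },
    "816": { abbr: "MO" }
};

// Return the state abbreviation for a US phone number, or null when unknown.
function stateForPhone(phoneNumber, codes) {
    let digits = String(phoneNumber || "").replace(/\D/g, ""); // keep digits only
    if (digits.length === 11 && digits[0] === "1") {
        digits = digits.slice(1); // drop the leading country code
    }
    if (digits.length !== 10) {
        return null; // not a complete US number
    }
    const entry = codes[digits.slice(0, 3)];
    return entry ? entry.abbr : null;
}

console.log(stateForPhone("209-123-4567", areaCodes));      // CA
console.log(stateForPhone("+1 (816) 555-0100", areaCodes)); // MO
console.log(stateForPhone("999-123-4567", areaCodes));      // null
```

Returning null for unknown or malformed numbers lets the caller decide what to do with leads that cannot be placed.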

With that information, our logic now looks like this:

const CSVToJSON = require("csvtojson");

const config = require("./config.json");
const Here = require("./here.js");
const AreaCodes = require(config.area_codes.usa);

var here = new Here();

CSVToJSON().fromFile("./data.csv").then(leads => {
    var leadInformation = [];
    for(var i = 0; i < leads.length; i++) {
        leadInformation.push(here.getAddressInformation(leads[i].location));
    }
    Promise.all(leadInformation).then(result => {
        for(var i = 0; i < leads.length; i++) {
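            if(result[i].length > 1 && leads[i].phone_number && leads[i].phone_number.length > 3) {
                // Assumes the number starts with the three-digit area code
                var area_code = leads[i].phone_number.substring(0, 3);
                // Guard against area codes that are missing from the lookup file
                var area = AreaCodes[area_code];
                for(var j = 0; area && j < result[i].length; j++) {
                    if(result[i][j].state === area.abbr) {
                        leads[i].location = result[i][j];
                        break;
                    }
                }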
            } else if(result[i].length == 1) {
                leads[i].location = result[i][0];
            }
        }
        console.log(leads);
    });
}, error => {
    throw error;
});

We loop through our location data and leads as normal. If the address information returned from the HERE API contains more than one address, we check the phone number and pull out the first three characters, which we treat as the area code. My phone number validation logic is non-existent, so a number written in any other format will not work. Using the area code, we look up the state and compare it with the state of each address returned from the HERE API. When there is a match, we use it. When nothing matches, the lead keeps its original location text. If there is only one address, we use that without getting the phone number involved.
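The selection step can also be pulled out into a small function, which makes the three cases (no results, one result, several results) explicit. This is a sketch rather than the code used above: pickLocation is a hypothetical helper, and unlike the loop above it returns null when no candidate matches so that those leads can be flagged for manual review.

```javascript
// Pick one candidate address. `state` is the abbreviation inferred from the
// phone number (or null when it could not be inferred).
function pickLocation(candidates, state) {
    if (candidates.length === 0) {
        return null; // nothing to choose from
    }
    if (candidates.length === 1) {
        return candidates[0]; // unambiguous
    }
    const match = candidates.find(candidate => candidate.state === state);
    return match || null; // null signals "needs manual review"
}

const tracy = [
    { city: "Tracy", state: "CA" },
    { city: "Tracy", state: "MO" },
    { city: "Tracy", state: "IA" }
];

console.log(pickLocation(tracy, "MO")); // { city: 'Tracy', state: 'MO' }
console.log(pickLocation(tracy, "TX")); // null
```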

Again, this is not foolproof when it comes to finding the correct address.

Conclusion

You just saw how to take a CSV file that contained people information such as location, parse it into JSON, then query for more reliable location data using the HERE Geocoder API. If we wanted to, we could convert our data back into CSV so it could be loaded into a CRM.
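For the trip back to CSV, a minimal sketch could look like the following. The leadsToCsv and escapeCsv helpers and the chosen column order are assumptions made for this example, and the quoting follows the usual CSV convention of wrapping values that contain commas, quotes, or line breaks.

```javascript
// Quote a value when it contains a comma, quote, or line break.
function escapeCsv(value) {
    const text = value === undefined || value === null ? "" : String(value);
    return /[",\r\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
}

// Flatten each lead's nested location and emit one CSV line per lead.
function leadsToCsv(leads) {
    const columns = ["firstname", "lastname", "company", "email", "phone_number",
        "city", "state", "postal_code", "county", "country"];
    const rows = leads.map(lead => {
        // Copy the location fields next to the lead's own fields
        const flat = Object.assign({}, lead, lead.location);
        return columns.map(column => escapeCsv(flat[column])).join(",");
    });
    return [columns.join(",")].concat(rows).join("\n");
}

const csv = leadsToCsv([{
    firstname: "Nic", lastname: "Raboy", company: "The Polyglot Developer",
    email: "test@example.com", phone_number: "209-123-4567",
    location: { city: "Tracy", state: "CA", postal_code: "95376", county: "San Joaquin", country: "USA" }
}]);
console.log(csv);
```

Leads whose location was never resolved simply get empty address columns, which makes them easy to spot after import.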

There is a lot of room for optimization in my code, but it should give you ideas on what you can do with file parsing in Node.js and using a location API for cleaning up location data.

Nic Raboy


Nic Raboy is an advocate of modern web and mobile development technologies. He has experience in C#, JavaScript, Golang and a variety of frameworks such as Angular, NativeScript, and Unity. Nic writes about his development experiences related to making web and mobile development easier to understand.