
Try to Geocode Again if Fails in Geocode R

I've recently wanted to geocode a large number of addresses (think circa 60k) in Ireland as part of a visualisation of the Irish property market. Geocoding can be simply achieved in R using the geocode() function from the ggmap library. The geocode function uses Google's Geocoding API to turn addresses from text to latitude and longitude pairs very simply.

There is a usage limit on the geocoding service for free users of 2,500 addresses per IP address per day. This hard limit cannot be overcome without employing a new IP address, or paying for a business account. To ease the pain of starting an R process every 2,500 addresses / day, I've built a script that geocodes addresses up to the API query limit every day with a few handy features:

  • Once it hits the geocoding limit, it patiently waits for Google's servers to let it proceed.
  • The script pings Google once per hour during the down time to start geocoding again as soon as possible.
  • A temporary file containing the current data state is maintained during the process. Should the script be interrupted, it will start again from the place it left off once any issues with the data/connection have been rectified.
Figure: Map generated with the Google Static Maps API.
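
To put the daily limit in perspective, here is a quick back-of-the-envelope calculation using the two figures quoted above (roughly 60k addresses and 2,500 lookups per day). The variable names are only for illustration, and it assumes the full quota is used every day with no failed or repeated queries.

```r
# Rough planning arithmetic for the free tier described above.
n_addresses <- 60000   # approximate size of the address list
daily_limit <- 2500    # free-tier quota per IP address per day

# Number of calendar days needed if the full quota is used every day
days_needed <- ceiling(n_addresses / daily_limit)
print(days_needed)
```

In other words, a job of this size runs for several weeks, which is why the wait-and-resume behaviour matters.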

The R script assumes that you are starting with a database that is contained in a single *.csv file, "input.csv", where the addresses are contained in the "address" column. Feel free to use/modify to suit your own devices!
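
Since the script depends on that "address" column, it may be worth checking the input before spending any of the daily quota. The helper below is a hypothetical sketch and not part of the original script; check_input and the example data frame are made-up names, and the in-memory data frame simply stands in for the result of read.csv("input.csv").

```r
# Hypothetical helper (not part of the original script): stop early if the
# input data lacks the column the geocoding script expects.
check_input <- function(df, column = "address") {
  if (!column %in% names(df)) {
    stop("Input is missing the '", column, "' column; found: ",
         paste(names(df), collapse = ", "))
  }
  # Count missing or whitespace-only addresses, which would waste queries
  n_blank <- sum(is.na(df[[column]]) | trimws(as.character(df[[column]])) == "")
  if (n_blank > 0) {
    warning(n_blank, " blank address(es) will not geocode")
  }
  invisible(df)
}

# Small in-memory example standing in for read.csv("input.csv")
example <- data.frame(address = c("1 Main Street, Dublin", ""),
                      price = c(250000, 180000))
checked <- suppressWarnings(check_input(example))
```

Note that R column names are case sensitive, so a file with an "Address" header would fail this check.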

Comments are included where possible:

    # Geocoding script for large list of addresses.
    # Shane Lynn 10/10/2013

    #load up the ggmap library
    library(ggmap)
    # get the input data
    infile <- "input"
    data <- read.csv(paste0('./', infile, '.csv'))

    # get the address list, and append "Ireland" to the end to increase accuracy
    # (change or remove this if your addresses already include a country etc.)
    addresses = data$address
    addresses = paste0(addresses, ", Ireland")

    #define a function that will process googles server responses for us.
    getGeoDetails <- function(address){
       #use the geocode function to query google servers
       geo_reply = geocode(address, output='all', messaging=TRUE, override_limit=TRUE)
       #now extract the bits that we need from the returned list
       answer <- data.frame(lat=NA, long=NA, accuracy=NA, formatted_address=NA, address_type=NA, status=NA)
       answer$status <- geo_reply$status

       #if we are over the query limit - want to pause for an hour
       while(geo_reply$status == "OVER_QUERY_LIMIT"){
           print("OVER QUERY LIMIT - Pausing for 1 hour at:")
           time <- Sys.time()
           print(as.character(time))
           Sys.sleep(60*60)
           geo_reply = geocode(address, output='all', messaging=TRUE, override_limit=TRUE)
           answer$status <- geo_reply$status
       }

       #return NA's if we didn't get a match:
       if (geo_reply$status != "OK"){
           return(answer)
       }
       #else, extract what we need from the Google server reply into a dataframe:
       answer$lat <- geo_reply$results[[1]]$geometry$location$lat
       answer$long <- geo_reply$results[[1]]$geometry$location$lng
       if (length(geo_reply$results[[1]]$types) > 0){
           answer$accuracy <- geo_reply$results[[1]]$types[[1]]
       }
       answer$address_type <- paste(geo_reply$results[[1]]$types, collapse=',')
       answer$formatted_address <- geo_reply$results[[1]]$formatted_address

       return(answer)
    }

    #initialise a dataframe to hold the results
    geocoded <- data.frame()
    # find out where to start in the address list (if the script was interrupted before):
    startindex <- 1
    #if a temp file exists - load it up and count the rows!
    tempfilename <- paste0(infile, '_temp_geocoded.rds')
    if (file.exists(tempfilename)){
           print("Found temp file - resuming from index:")
           geocoded <- readRDS(tempfilename)
           #resume at the row after the last saved one, so it is not geocoded twice
           startindex <- nrow(geocoded) + 1
           print(startindex)
    }

    # Start the geocoding process - address by address. geocode() function takes care of query speed limit.
    # (the index filter leaves nothing to do if every address is already in the temp file)
    for (ii in seq_along(addresses)[seq_along(addresses) >= startindex]){
       print(paste("Working on index", ii, "of", length(addresses)))
       #query the google geocoder - this will pause here if we are over the limit.
       result = getGeoDetails(addresses[ii])
       print(result$status)
       result$index <- ii
       #append the answer to the results file.
       geocoded <- rbind(geocoded, result)
       #save temporary results as we are going along
       saveRDS(geocoded, tempfilename)
    }

    #now we add the latitude and longitude to the main data
    data$lat <- geocoded$lat
    data$long <- geocoded$long
    data$accuracy <- geocoded$accuracy

    #finally write it all to the output files
    saveRDS(data, paste0("../data/", infile ,"_geocoded.rds"))
    write.table(data, file=paste0("../data/", infile ,"_geocoded.csv"), sep=",", row.names=FALSE)
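
The script records a status for every address, so rows that did not come back as "OK" can be tried again afterwards. The following is a minimal sketch of such a second pass and not part of the original script: retry_failed, its max_passes argument and stub_geocoder are hypothetical names, and the geocoder is passed in as a function so that the example runs without touching Google's servers. In practice getGeoDetails could be supplied instead.

```r
# Hypothetical helper (not in the original script): re-query rows whose status
# is not "OK". `geocode_one` is any function that takes an address string and
# returns a one-row data frame with at least lat, long and status columns.
retry_failed <- function(results, addresses, geocode_one, max_passes = 2) {
  for (pass in seq_len(max_passes)) {
    failed <- which(is.na(results$status) | results$status != "OK")
    if (length(failed) == 0) break
    for (i in failed) {
      reply <- geocode_one(addresses[i])
      results$lat[i]    <- reply$lat
      results$long[i]   <- reply$long
      results$status[i] <- reply$status
    }
  }
  results
}

# Stub geocoder for illustration only: it "finds" every address it is given.
stub_geocoder <- function(address) {
  data.frame(lat = 53.35, long = -6.26, status = "OK", stringsAsFactors = FALSE)
}

results <- data.frame(lat = c(53.3, NA), long = c(-6.2, NA),
                      status = c("OK", "ZERO_RESULTS"), stringsAsFactors = FALSE)
fixed <- retry_failed(results,
                      c("1 Main Street, Dublin", "2 High Street, Cork"),
                      stub_geocoder)
```

This sketch only updates the lat, long and status columns, and every retry counts against the daily quota, so it is probably only worth running for statuses that suggest a temporary problem rather than a genuinely unknown address.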

Let me know if you find a use for the script, or if you have any suggestions for improvements.

Please be aware that it is against the Google Geocoding API terms of service to geocode addresses without displaying them on a Google map. Please see the terms of service for more details on usage restrictions.


Source: https://www.shanelynn.ie/massive-geocoding-with-r-and-google-maps/
