Command Line Interface and Nokogiri

This post showcases the command line interface (CLI) application that I created as part of the curriculum for the Software Engineering program at Flatiron School. A CLI is a simple text-based user interface that is commonly used to manage and view files on your computer. The purpose of this project was to create a CLI application that accesses data from an outside source (a web page). The application needed to use object-oriented design patterns, so our focus was on converting outside data into usable internal objects within our application.

Introduction

The application I built is called “Hiking Trails Locator” and its primary function is to obtain information on hiking trails from anywhere in the United States via a user-inputted zip code. I wanted to build something that I was personally interested in, as well as something that had enough data that I could play around with and learn from. As an avid hiker, I thought it could be useful to have a simple application that provided hiking trail information in a concise form without the unnecessary details.

There are many beautiful hiking trails within 100 miles of where I live.

To get started, I created a simple flow diagram on my whiteboard to map out functionality for the application. I knew I wanted to provide the user with a choice of trails in a summarized list that could be easily read without scrolling. I also wanted the user to be able to go back to the list of trails without starting their search over again. The updated flow diagram for the project can be found below.

Flow diagram for the Hiking Trails Locator application.

Setup

The setup for this application included installing the following Ruby gems:
Geocoder, Nokogiri, JSON, Pry, and Colorize. In addition, the Ruby standard libraries Open-URI and Net/HTTP were used in the data gathering process.

I used the macOS Terminal to create the directories and files that house the application on my local machine. Since this was my first project, I also learned how to create a new GitHub repository and how to commit and push to it.

The application has the following files and directories:

bin/hike: This is the executable file that runs the application. It creates a new instance of the CLI class ‘TrailSearcher’.
config/environment.rb: This file requires, or “brings in”, all of the gems and the Ruby files in the ‘lib’ directory so that our executable file is able to read the code correctly.
lib/trail.rb: This is the model class, our object-oriented model responsible for creating and storing individual trails as objects.
lib/trail-searcher.rb: This is the CLI class that is responsible for the operation of the CLI. It calls and interacts with all other files in the ‘lib’ directory.
lib/trail_importer.rb: This is the Application Programming Interface (API) class that is responsible for GET requests to the Hiking Project API.
lib/trail_det_importer.rb: This is the class that is responsible for scraping the Hiking Project URL associated with a specific trail.
Gemfile: This file manages the gems used throughout this application.

API For Gathering Preliminary Trail Data

The Hiking Project provides a very useful API that is accessible with a private API key. The TrailImporter class is responsible for gathering data from the API and stores the private API key in a class variable. The class method .get_trails_by_lat_long_dist(lat, long, dist) is called after the user provides a five-digit zip code and a distance to search (in miles). It takes three arguments: latitude, longitude and distance. The Geocoder gem converts the user’s zip code into a latitude and longitude.

The class method uses the Open-URI and Net/HTTP libraries to request the data and parse it into a usable form. The TrailImporter class pulls the trail identification number, trail name, summary, trail length and URL from the API and stores them in a new hash with those identifiers as the keys. Gathering this preliminary data is important because it is what allows a list of available trails to be generated and displayed to the user. The URL is significant because it points to a web page with further details about a specific trail. This URL is used later, when we want to scrape the page for information such as the trail description and elevation.

The TrailImporter class which gathers data from the Hiking Project API by using a free private API key.
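
Since the original code was shown as an image, here is a minimal sketch of what the request-and-parse step described above could look like. It is not a copy of my class: the endpoint path, the query parameter names, the JSON field names, and the helper methods build_uri and parse_trails are assumptions made for illustration, and the API key is a placeholder.

```ruby
require 'json'
require 'net/http'
require 'uri'

class TrailImporter
  # Assumption: endpoint and parameter names are illustrative and may not
  # match the current Hiking Project API. The key is a placeholder.
  BASE_URL = 'https://www.hikingproject.com/data/get-trails'
  @@api_key = 'YOUR_PRIVATE_KEY'

  # Hypothetical helper: builds the request URI for a latitude, longitude,
  # and search distance in miles.
  def self.build_uri(lat, long, dist)
    uri = URI(BASE_URL)
    uri.query = URI.encode_www_form(lat: lat, lon: long, maxDistance: dist, key: @@api_key)
    uri
  end

  # Hypothetical helper: reduces the raw JSON body to an array of hashes
  # holding only the five values the CLI needs.
  def self.parse_trails(json_body)
    JSON.parse(json_body).fetch('trails', []).map do |trail|
      {
        id: trail['id'],
        name: trail['name'],
        summary: trail['summary'],
        length: trail['length'],
        url: trail['url']
      }
    end
  end

  # Performs the network request and returns the parsed trail hashes.
  def self.get_trails_by_lat_long_dist(lat, long, dist)
    response = Net::HTTP.get_response(build_uri(lat, long, dist))
    parse_trails(response.body)
  end
end
```

Keeping the parsing separate from the network call makes it possible to try parse_trails on a saved JSON string without touching the API.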

Scraping a Specific Trail URL

The TrailDetailImporter class has a single class method that takes the URL of the specific trail the user wants to know more about. The URL comes from the API data and is saved as an attribute of the Trail object. A web page scrape was necessary because the API only provides limited, “big-picture” information. The most important value the API returns is the trail URL, because it points to a source of additional details. The information available at that URL complements the information gathered from the API.

The Ruby gems Open-URI and Nokogiri were used to scrape the trail detail page for the following attributes: trail name, difficulty, description, length, route, high elevation, low elevation, elevation gain and dogs allowed. The description attribute was difficult to gather because the layout of the trail detail pages, and therefore the CSS selectors needed, was not the same for every trail. An if/else statement accounts for the cases in which the description is not available. The return value of this class method is a single hash with multiple key/value pairs.

Creating Trail Objects With the Model Class

The representation of the data I gathered from the API and scraped from web pages is shown in the model class named Trail. This class creates instances of Trail, which is to say each instance is an individual trail object. Each object has attributes that are generated from both external data sources.

The class method .create_from_collection(trail_array) takes an array of all of the trails that meet the latitude, longitude and distance criteria. Each element of the array is a hash representing an individual trail, containing the five key/value pairs that were created in the TrailImporter class. This method iterates over the array and calls self.new on each hash. When self.new is called, the initialize method runs.

The instance method #add_trail_attributes(attributes_hash) takes a hash of the trail detail attributes. This method is called after the object is created but before the trail details are displayed to the user. By iterating over the hash, we use the send method on self to assign instance variables on an already existing instance of Trail.

New objects are created in the Trail class by calling create_from_collection on the class and passing in an array. Attributes are added to an existing Trail instance by calling add_trail_attributes on the instance and passing in a hash of attributes.

The initialize method uses metaprogramming: it iterates over the passed-in hash and calls the send method on self, which is the instance currently being created. For each key/value pair, send invokes the setter method named after the key (the kind generated by attr_writer) and passes it the corresponding value, which assigns the matching instance variable. This creates a dynamic object for each trail and allows us to assign attributes efficiently with minimal code.
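
The mass assignment described above can be sketched roughly as follows. This is a simplified version under a few assumptions: the attribute list is shortened, the private assign_attributes helper is my own naming, and the bookkeeping that saves each instance to @@all is left out so the example stays focused on send.

```ruby
class Trail
  # attr_accessor generates both the readers and the setters that send relies on.
  attr_accessor :id, :name, :summary, :length, :url, :difficulty, :description

  def initialize(trail_hash)
    assign_attributes(trail_hash)
  end

  # Builds one Trail per hash and returns the new instances.
  def self.create_from_collection(trail_array)
    trail_array.map { |trail_hash| new(trail_hash) }
  end

  # Adds scraped details to an instance that already exists.
  def add_trail_attributes(attributes_hash)
    assign_attributes(attributes_hash)
    self
  end

  private

  # Hypothetical helper: for a pair such as name: 'Bear Peak',
  # this calls self.name = 'Bear Peak'.
  def assign_attributes(hash)
    hash.each { |key, value| send("#{key}=", value) }
  end
end
```

One consequence of this approach is that a key without a matching setter raises NoMethodError, so the accessor list has to cover every key in the incoming hashes.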

The .all class method is a getter (reader) method that gives us access to the contents of the @@all array of trails. The .sort_all class method sorts the @@all array in ascending order of trail length.

These class methods provide access to all saved instances of trails and sort them according to trail length.
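
Here is a trimmed-down sketch of those two class methods, using a Trail that only knows its name and length. One assumption to flag: this version of sort_all returns a sorted copy and leaves @@all in insertion order, whereas sort_by! would sort the stored array in place.

```ruby
class Trail
  attr_reader :name, :length

  # Class variable holding every Trail instance that has been created.
  @@all = []

  def initialize(name, length)
    @name = name
    @length = length
    @@all << self
  end

  # Reader for the saved instances.
  def self.all
    @@all
  end

  # Returns the trails ordered from shortest to longest.
  def self.sort_all
    @@all.sort_by(&:length)
  end
end
```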

The Trail Searcher CLI Class

The final class in the application is the CLI class TrailSearcher which manages the interaction between the user and the rest of the code.

In order to run the application, the bin/hike file is executed in the terminal. This file creates a new instance of the TrailSearcher class and assigns it to a new variable. The #run method is then called on the instance and the application begins. The #run method invokes four primary instance methods: #greeting, #prompt_zip, #prompt_trail_details, and #exit_prompt.

The #greeting method is straightforward and prompts the user for their name, greets them by their name, and then shows the title of the application with a one sentence description.

The #prompt_zip method first prompts the user for a five-digit zip code and then validates it using a regular expression. Once the zip code is validated, the #zip_conversion(zip_code) method is invoked, which uses the Geocoder gem to convert the zip code to a latitude and longitude, each of which is assigned to a separate variable. Geocoder provides other useful information, such as the city and state, which can also be stored in variables. This method then calls the #prompt_distance method to prompt the user for a distance in miles and validate their input. Next, the #get_trails(lat, long, dist, city, state, zip) method is called, which uses the TrailImporter class to request and gather data from the API. New instances of the Trail class are then created, each with accessible attributes. Finally, the #list_trails method is invoked, which sorts the trails stored in the @@all class variable of the Trail class and prints out a numbered list with the trail name, trail length, and summary.

This method pulls data from the API, creates Trail instances, and lists the trails.

The #prompt_trail_details method prompts the user to enter the number of the trail they would like additional details for. After the input is validated, the #get_trail_details(trail_num) method is called, which invokes the TrailDetailImporter class by calling its own .get_trail_details(trail_url) method and passing in the URL for that specific trail. Then #list_trail_details(user_trail) is called with the user-selected trail and prints out the additional details for that particular trail.

The final method that the #run method calls is #exit_prompt. This method prompts the user one time and gives three options: go back to list of trails, enter a new zip code, or exit the application.

Validation and Errors

There are five prompts the user receives in this application. The first is the prompt for the user’s name. Since this is a person’s name, I used the #match method with a regex argument to match any input that contains lower and/or upper case letters.

The second prompt asks for a zip code to search for trails. I again used the #match method with a regex argument and limited input to exactly 5 digits. Once validated, the zip code is passed to the Geocoder gem’s search method.

The third prompt asks the user for a distance in miles to extend the search radius. To be valid, this input needed to be a number between 1 and 100, so I checked whether the user input fell within the range 1 to 100. In addition, I used regex to ensure only digits could be entered.

The fourth prompt asks the user for a number corresponding to the number displayed inside the trail list. The validation check for this was simply ensuring the input was an integer and that it was in the range of 1 to however many items were in the Trail.all array.

The final prompt gives the user three options. The input is only accepted if it exactly matches one of them, so the == comparison operator is used to check whether user_input equals "1", "2", or "exit".
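
To make the five checks above concrete, here is a sketch that gathers them into one module. The InputValidator module and its method names are hypothetical (in the application the checks live inside the prompt methods), and the name pattern is an assumption that accepts letters only.

```ruby
# Hypothetical module grouping the five input checks described above.
module InputValidator
  # Assumption: a name is one or more upper or lower case letters.
  def self.valid_name?(input)
    input.match?(/\A[a-zA-Z]+\z/)
  end

  # Exactly five digits, nothing before or after.
  def self.valid_zip?(input)
    input.match?(/\A\d{5}\z/)
  end

  # Digits only, and within the range 1 to 100.
  def self.valid_distance?(input)
    input.match?(/\A\d+\z/) && (1..100).cover?(input.to_i)
  end

  # Digits only, and within 1 to the number of listed trails.
  def self.valid_trail_number?(input, trail_count)
    input.match?(/\A\d+\z/) && (1..trail_count).cover?(input.to_i)
  end

  # Only the three exact options are accepted.
  def self.valid_exit_choice?(input)
    %w[1 2 exit].include?(input)
  end
end
```

Note that writing user_input == "1" || "2" || "exit" in Ruby would always be truthy, which is why the last check compares against each option through include? instead.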

Error messages alerting the user to invalid input and/or unavailable information are implemented throughout the program. Every user input method has an else branch that gives feedback to the user so that they can continue using the application correctly. When an error is made, a message is printed to the console telling the user that their input was invalid. The original method is then called again to re-prompt the user and start the process over. An until loop is used in the #exit_prompt method to ensure the user is continually prompted until an acceptable input is received.
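
The until loop could look something like the sketch below. It is written as a standalone method that takes its input and output streams as keyword arguments, which is an assumption made so the loop can be tried without a live terminal; the prompt wording is also illustrative.

```ruby
# Keeps prompting until the user enters "1", "2", or "exit".
# Returns the accepted choice, or nil if the input stream ends first.
def exit_prompt(input: $stdin, output: $stdout)
  valid_choices = %w[1 2 exit]
  choice = nil

  until valid_choices.include?(choice)
    output.puts 'Enter 1 to go back to the list, 2 for a new zip code, or exit to quit:'
    line = input.gets
    return nil if line.nil? # no more input to read

    choice = line.strip
    output.puts "Invalid input: #{choice}" unless valid_choices.include?(choice)
  end

  choice
end
```

Compared with calling the method again after an error, the loop avoids stacking up a new method call for every invalid entry.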

Issues and Fixes

I ran into two main issues while coding this application. The first occurred while refactoring the original code. Several of my methods in the CLI class TrailSearcher were unnecessarily long, so I split them up into multiple smaller methods. Originally my code did not have any instance variables in this class. However, I wanted to display error messages that clearly showed the user their input and why it was invalid. Also, after every valid input, I wanted the user to see what they had entered printed as a logical sentence so they could track in the terminal what had occurred. To implement this, I needed access to particular variables that were local to individual methods. So after splitting the methods up, I changed those local variables to instance variables. I later realized that instance variables in a CLI class did not make sense logically, so I reverted them to local variables. I learned that I could simply pass those local variables as arguments to the smaller methods I had created, which makes them available across multiple methods. This can be seen in the #zip_conversion(zip_code) method, which invokes the #prompt_distance(lat,long,city,state,zip_code) method. I used the same approach several more times to refactor other lengthy methods.

These two methods were initially one very long method, so refactoring into smaller methods became necessary. All five variables are accessible when prompting for distance because zip_conversion passes them in as arguments.
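
The idea of handing local variables on as arguments can be sketched like this. To keep the example self-contained, the geocoding step is replaced by a lookup callable and a Location struct, both of which are hypothetical stand-ins for the Geocoder result, and prompt_distance simply returns a sentence instead of prompting.

```ruby
# Hypothetical stand-in for the data a geocoding lookup would return.
Location = Struct.new(:latitude, :longitude, :city, :state)

# Converts the zip code, then passes every local value forward as an
# argument instead of storing it in an instance variable.
def zip_conversion(zip_code, lookup)
  location = lookup.call(zip_code)
  prompt_distance(location.latitude, location.longitude,
                  location.city, location.state, zip_code)
end

# All five values are available here because the caller passed them in.
def prompt_distance(lat, long, city, state, zip_code)
  "Searching near #{city}, #{state} #{zip_code} (#{lat}, #{long})"
end
```

A call such as zip_conversion('80302', lookup) then carries the city, state, and coordinates into the next step without any shared state.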

Another issue I came across was adding the ability to check for existing data instead of pulling the same data twice. I originally coded the application without checking for existing objects in the Trail class so that I could get the application up and running. When I reworked the code to add this check, I discovered there were two places where it was needed. The first is after the user chooses a number and is shown the details for a specific trail, at which point they are given the option to go back to the list that was generated. Because the trails are already stored in the @@all array of our Trail class, the program can display the exact same list as before through the .sort_all method without pulling the data from the API again. The second is the #get_trail_details(trail_num) method, which checks whether the details for that specific trail already exist on the corresponding object in @@all.

In the #get_trail_details(trail_num) method, I initially struggled with implementing the correct logic to only do a web page scrape if the object’s detailed attributes did not already exist. After trying many ways to code this, I eventually ended up iterating over the sorted list of Trail.all and checking whether the selected trail’s description attribute (found via the user-inputted number converted to an index) was nil, which meant that the detailed attributes from the scrape did not yet exist. If they did not exist, they were added to that particular instance of Trail using the #add_trail_attributes(detail_hash) method of the Trail class. At this point, no new objects are created since the objects for all listed trails already exist. Instead of creating a completely new instance, attributes are simply added to the existing object when necessary.

The get_trail_details method implements logic to avoid scraping the same data twice.
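
A stripped-down sketch of that check is shown here. It assumes a small Trail class with only four attributes, takes the sorted list and a scraper callable as arguments (the callable stands in for the TrailDetailImporter call), and indexes directly into the list rather than iterating over it.

```ruby
class Trail
  attr_accessor :name, :url, :description, :difficulty

  def initialize(name, url)
    @name = name
    @url = url
  end

  # Adds scraped details to this existing instance.
  def add_trail_attributes(attributes_hash)
    attributes_hash.each { |key, value| send("#{key}=", value) }
  end
end

# sorted_trails is the list as shown to the user, trail_num is the
# 1-based number they typed, scraper is a hypothetical callable.
def get_trail_details(sorted_trails, trail_num, scraper)
  trail = sorted_trails[trail_num.to_i - 1]
  # A nil description means this trail has not been scraped yet.
  trail.add_trail_attributes(scraper.call(trail.url)) if trail.description.nil?
  trail
end
```

This only works as a cache because the scrape always sets a description, even if it is just a "not available" message, so a scraped trail never has a nil description.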

When list_trail_details(user_trail) is called, it uses the user-selected trail that was passed in to print out nine attribute values of the object.

The code that prints trail details to the console.

The Working Application

Here is an example of running through the application:

When the app is first run, you will see the greeting, name prompt and introductory message.
The user is then prompted to enter a zip code and distance.
A list of up to 10 trails is displayed in order of trail length, followed by a prompt for a number.
The trail details for the requested trail are displayed.
The user has three options to choose from.
And finally, you can go hiking!

The GitHub repository for this application can be found here:
https://github.com/dougschallmoser/hiking-trails-locator-cli-app
