Due date: Wednesday, December 3, 2025.
In this exercise, we will use the Keras library to build a neural network and train it on a couple of data sets.
To install Keras on your own computer, you need to install a back-end library first: TensorFlow, PyTorch, or JAX. The provided code uses TensorFlow, but if you're more familiar with one of the others, you can use it instead.
If you have an older version of Python (for example, earlier than Python 3.4), I suggest updating it first. Then, from a command line, run the following commands:
pip install tensorflow
pip install numpy
pip install keras
This should be sufficient for the module to run from the command line. If you run it in VS Code and it doesn't recognize the module, make sure that VS Code is using the right interpreter. You can run
which python
in the command line (on Windows, use where python instead), which will print the path of the Python interpreter that the modules were installed for. Then in VS Code, press Ctrl-Shift-P, choose Python: Select Interpreter, and select the interpreter you saw in the command above.
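As a cross-check from inside Python, the short script below prints which interpreter is actually running and whether it can locate the three packages. This is only a minimal sketch: the helper name module_available is made up for illustration, and it relies on the standard importlib.util.find_spec call. Run it once from the command line and once from VS Code; if the interpreter paths differ, VS Code is using a different Python.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the current interpreter can locate the named top-level module."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Path of the interpreter that is actually running this script.
    print("Interpreter:", sys.executable)
    for name in ("tensorflow", "numpy", "keras"):
        status = "found" if module_available(name) else "NOT found"
        print(f"{name}: {status}")
```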
If you aren't able to run the modules on your own computer, you can fall back on the installation on our Linux systems. Just open an ssh connection to cs01.cs.iusb.edu (or cs02 or cs03), and download the files with
wget https://www.cs.iusb.edu/~dvrajito/teach/c463/keras_start.py
and do the same for the other files. You must run the module with
python3 keras_start.py
a. Files. Download the following files:
keras_start.py
ua_passengers.csv
StudentsPerformance.csv
b. Input Files. Copy the function readCSV that you wrote for Homework 1. Modify it so that if the first line contains alphabetical characters, we assume that it contains the column titles and we ignore it.
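One possible way to detect and skip the title row is sketched below. It does not reproduce your Homework 1 function: the names has_letters and read_csv_skip_header are hypothetical, and the sketch assumes the file is comma-separated and that rows should be kept as lists of strings. Adapt the idea to your own readCSV.

```python
import csv


def has_letters(fields):
    """Return True if any field contains an alphabetical character."""
    return any(ch.isalpha() for field in fields for ch in field)


def read_csv_skip_header(path):
    """Read a CSV file into a list of rows (lists of strings).

    If the first row contains letters, it is assumed to hold the
    column titles and is dropped.
    """
    with open(path, newline="") as f:
        rows = [row for row in csv.reader(f) if row]  # ignore blank lines
    if rows and has_letters(rows[0]):
        rows = rows[1:]
    return rows
```

Note that this test would also treat a first row containing a value such as 1e5 as a title row, since the e counts as a letter. That should not matter if numbers in the data are written in plain decimal notation.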
The file ua_passengers.csv contains only numerical data below the title row. The file StudentsPerformance.csv contains categories as well, so you will need to convert the categories to numerical values before using it. You can either use Excel (or some other spreadsheet) and the function IF for the conversion, or write a function in Python for it.
For example, if a cell in column E contains either "none" or "completed", then in Excel you can write a formula like
=IF(E2="none", 0, 1)
If you use Excel for this, you will have to create a new CSV file containing the student data converted to numerical.
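If you prefer the Python route, the conversion could look roughly like the sketch below. It assumes the rows have already been read as lists of strings with the title row removed. The function name encode_categories is made up for illustration. Every value that does not parse as a number is treated as a category and is given the next unused integer code within its column, in order of first appearance.

```python
def encode_categories(rows):
    """Convert a table of strings to floats, numbering categories per column.

    Returns the converted table and the mapping used for each
    categorical column, so the encoding can be documented.
    """
    codes = {}  # column index -> {category text: integer code}
    result = []
    for row in rows:
        converted = []
        for col, value in enumerate(row):
            try:
                converted.append(float(value))
            except ValueError:
                mapping = codes.setdefault(col, {})
                # Assign the next unused integer the first time a category appears.
                code = mapping.setdefault(value, len(mapping))
                converted.append(float(code))
        result.append(converted)
    return result, codes


# Small usage example with made-up rows.
table, codes = encode_categories([["none", "72"], ["completed", "69"], ["none", "90"]])
print(table)  # [[0.0, 72.0], [1.0, 69.0], [0.0, 90.0]]
print(codes)  # {0: {'none': 0, 'completed': 1}}
```

For a column with two categories this matches the Excel IF formula. For columns with more than two categories, integer codes impose an artificial ordering on the categories, which is something you may want to mention when you discuss your results.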
c. Training. Write a separate Python function for each of the data sets.
For both of them, we will use the last column as the output, so you will need to extract the last column or just save it in a separate data structure to begin with. For both of them, we will use the first 90% of the data for training and the last 10% for testing.
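The column extraction and the 90%/10% split described above can be sketched as follows. The sketch uses plain Python lists and hypothetical function names. If you store the data in NumPy arrays instead, the same slicing idea applies.

```python
def split_inputs_outputs(rows):
    """Separate the last column (output) from the remaining columns (inputs)."""
    inputs = [row[:-1] for row in rows]
    outputs = [row[-1] for row in rows]
    return inputs, outputs


def split_train_test(inputs, outputs, train_fraction=0.9):
    """Use the first part of the data for training and the rest for testing."""
    cut = int(len(inputs) * train_fraction)
    train = (inputs[:cut], outputs[:cut])
    test = (inputs[cut:], outputs[cut:])
    return train, test


# Usage example with ten made-up rows of the form [x1, x2, y].
rows = [[i, 2 * i, 3 * i] for i in range(10)]
inputs, outputs = split_inputs_outputs(rows)
(train_x, train_y), (test_x, test_y) = split_train_test(inputs, outputs)
print(len(train_x), len(test_x))  # 9 1
```

The split keeps the rows in their original order, as the assignment asks, rather than shuffling them first.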
d. Experiments. Experiment with varying numbers of epochs to see how that affects the results. Try using all the columns or a subset of the columns to see if that makes the results better or worse. Document what parameter settings you used and how they influenced the outcome in a text file (or Word document) that you also submit with the homework.
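To keep the experiments organized, you might loop over the settings and collect the outcomes in one table, as in the sketch below. Everything here is hypothetical: train_and_evaluate stands for whatever function you write that builds and trains the network with the given number of epochs and input columns and returns a single score, such as the test error. The helper names are made up for illustration.

```python
def run_experiments(train_and_evaluate, epoch_counts, column_subsets):
    """Call train_and_evaluate for every (epochs, columns) pair and collect results."""
    results = []
    for columns in column_subsets:
        for epochs in epoch_counts:
            score = train_and_evaluate(epochs, columns)
            results.append((epochs, columns, score))
    return results


def format_results(results):
    """Format the collected results as tab-separated text for the report."""
    lines = ["epochs\tcolumns\tscore"]
    for epochs, columns, score in results:
        lines.append(f"{epochs}\t{columns}\t{score:.4f}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Stand-in for a real training function, used only to show the call pattern.
    def fake_train(epochs, columns):
        return 1.0 / (epochs * len(columns))

    print(format_results(run_experiments(fake_train, [10, 50], [(0, 1), (0, 1, 2)])))
```

The formatted text can be pasted into the summary document as a starting point for your discussion.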
Upload all the Python files you have modified or created, the document with the summary of your results, and, if you used a spreadsheet for the conversion, the CSV file containing the second data set converted to numerical values. Otherwise, make sure that the Python function that processes this data set is included. Upload all the files to Canvas in Assignments - Homework 11.