We compiled a set of publicly available neural networks for the classification of 3D models. The code works with the ModelNet40 and ShapeNetCore datasets, which are also available online. This manual explains how to convert the datasets and how to train and test these networks.
To run the code you will need a computer with a Linux operating system and an NVIDIA GPU. You will also need to install Docker with NVIDIA GPU support. Each neural network is an independent Docker image, and all of its dependencies are installed when the image is built. All code is written in Python.
The code is made to work with the ModelNet40 and ShapeNetCore datasets. The easiest way to run it with a custom dataset is to restructure your data so that it copies the structure of one of these datasets (see dockers/data_conversion/shapenet_tools). You can download all the code from the paper’s webpage.
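For reference, if you need to restructure a custom dataset, ModelNet40 is distributed roughly with the layout sketched below: one directory per category, each split into train and test subdirectories of .off mesh files. Mirroring such a layout with your own data is usually the quickest way to plug it into the conversion scripts.

```
ModelNet40/
├── airplane/
│   ├── train/
│   │   ├── airplane_0001.off
│   │   └── ...
│   └── test/
│       └── ...
├── bathtub/
│   └── ...
└── ...
```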
Each network is implemented as a separate Docker image. To learn more about Docker, images, and containers, visit this page.
Each neural network is contained in one directory in /dockers. None of the networks accepts mesh files as its input directly, so some data conversion is required. All data conversion is implemented in Docker images with the same structure as the neural network images themselves. The code for data conversion is located in /dockers/data_conversion.
Each directory contains two important files, config.ini and run.sh, which you will need to open and edit. Another important file is the Dockerfile, which contains the definition of the Docker image. The remaining files are those which differ from the original network implementation; the original network code is downloaded automatically when building the image.
run.sh is a runnable script which builds the Docker image, runs the Docker container, and executes (trains and evaluates) the neural network or runs the data conversion. You will need to set up a couple of variables here (an illustrative example follows the list):
name - Will be used as the name of the Docker image and the Docker container. You can leave this at the default value unless it conflicts with an already existing image or you want to run more instances of this image at once. For data conversion scripts the name is the name of the converted dataset, and a directory of the same name will be created in the output directory; in this case the name of the image can be changed via the variable image_name.
dataset_path - Contains the path to the root directory of the dataset on your filesystem.
out_path or output_dir - Contains the path to the directory where training logs and network weights will be saved. This directory will be created automatically.
GPU - Index of the GPU which will be visible to the Docker container (starting from 0). This variable must be a single integer; we currently do not support multiple GPUs.
docker_hidden - Must be one of t or d. With t the container will be run in interactive mode, meaning it runs in your console. With d it will run in detached mode, i.e. in the background. For more information check the Docker documentation here.
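For illustration, the variable block at the top of run.sh might look roughly like the following. All values below are placeholders, not the defaults shipped with the repository; adjust them to your own machine.

```bash
# Illustrative values only -- not the repository defaults.
name="pointnet_modelnet40"                      # image/container name (or converted dataset name)
dataset_path="/home/user/datasets/ModelNet40"   # dataset root on the host filesystem
output_dir="/home/user/experiments/pointnet"    # training logs and weights will be written here
GPU=0                                           # index of the single GPU visible to the container
docker_hidden=d                                 # t = interactive (console), d = detached (background)
```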
config.ini contains most of the relevant parameters of the network or data conversion. The file is split into sections, where each section is started by a [SECTION] statement. Then on each line a parameter is set in the format key = value. You can find an explanation of the network parameters in later sections.
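In general the file has the following shape (the section and key names here are only placeholders):

```ini
[SECTION]
key = value
another_key = another_value
```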
To convert your dataset you need to set the parameters described above and then run the script run.sh in your console. This will convert the dataset to various formats directly readable by the neural networks.
Parameters for data conversion in the config.ini file:
data - The path to the dataset inside the container. Does not have to be changed.
output - The path to the directory inside the container where the converted dataset will be saved. Does not have to be changed.
log_file - The path and name of the file where the progress of the training/evaluation or data conversion will be written. By default it is located in the output directory and called log.txt.
num_threads - The maximum number of threads to use.
dataset_type - Denotes which dataset is being converted. Currently it must be one of modelnet or shapenet.
Note that these paths are as seen from the running Docker container. The real paths on the host system correspond to the volume mapping (-v host_path:path_in_container in the run.sh file).
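As an illustration of how the two relate (the values below are hypothetical, not the repository defaults): if run.sh mounts the dataset with -v /home/user/datasets/ModelNet40:/data and the output directory with -v /home/user/converted:/converted, a matching data-conversion part of config.ini could look like this:

```ini
; hypothetical values -- the real section name and defaults are in the
; config.ini shipped with each data_conversion image
[DATA_CONVERSION]
data = /data
output = /converted
log_file = /converted/log.txt
num_threads = 8
dataset_type = modelnet
```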
For more detail about individual data conversion scripts, continue here.
Each of the neural networks is implemented in Python, but each in a different framework; that is why we use the Docker infrastructure. We try to present a unified framework that makes it easy to train and test the networks without changing the code. This section briefly introduces the networks used and some of their most important parameters.
Parameters common to all neural networks:
name - Will be used as the name of the experiment in log files.
data - The path to the dataset inside the container. Does not have to be changed.
log_dir - The path to the directory inside the container where logs and weights will be saved. Does not have to be changed.
num_classes - The number of classes in the dataset (40 for ModelNet40, 55 for ShapeNetCore).
batch_size - The number of input examples processed at once when training and testing the network.
weights - If you want to test or fine-tune an already trained network, this should be the number of the pre-trained model (e.g. a checkpoint number from a previous training run). If you want to train from scratch, this should be -1.
snapshot_prefix - The name of the file where weights will be saved. The number of training epochs after which the weights were saved will be appended to this prefix.
max_epoch - The number of epochs to train for. One epoch means one pass through the training part of the dataset.
save_period - The trained network will be saved at every epoch divisible by save_period.
train_log_frq - The frequency of logging during training, given roughly as the number of examples seen by the network.
test - If you want to only test an already trained network, set this to True; the weights parameter then has to have a valid value bigger than -1. Should be False for training.
For more details about individual networks, continue here.
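For illustration, a config.ini for one of the networks might contain values along these lines. All values are placeholders; consult the config.ini shipped with each network for the real section names and defaults.

```ini
; hypothetical values -- check the config.ini of the particular network
[NETWORK]
name = my_experiment
data = /data
log_dir = /logs
num_classes = 40
batch_size = 16
weights = -1
snapshot_prefix = snapshot
max_epoch = 60
save_period = 5
train_log_frq = 1000
test = False
```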
Our framework offers some basic logging options. It saves several .csv files to the logging directory. The logger keeps track of the training time, the training epochs, and several other values.
By default four values are tracked: training loss, training accuracy, test loss and test accuracy. Evaluation on the test set is performed after each epoch of training. Some basic graphs are also created with the matplotlib library and saved during training.
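If you want to inspect the logged values yourself, a minimal sketch along the following lines can be used. It assumes a CSV with one row per epoch and columns named epoch, train_loss, train_acc, test_loss and test_acc; the actual file and column names may differ, so check the files in your logging directory first.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical file and column names -- adjust to the .csv files
# actually written to your logging directory.
log = pd.read_csv("logs/my_experiment.csv")

fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(10, 4))
ax_loss.plot(log["epoch"], log["train_loss"], label="training loss")
ax_loss.plot(log["epoch"], log["test_loss"], label="test loss")
ax_loss.set_xlabel("epoch")
ax_loss.legend()
ax_acc.plot(log["epoch"], log["train_acc"], label="training accuracy")
ax_acc.plot(log["epoch"], log["test_acc"], label="test accuracy")
ax_acc.set_xlabel("epoch")
ax_acc.legend()
fig.savefig("training_curves.png")
```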
When testing your already trained network (using test = True in config.ini, or automatically after the training ends), an evaluation text file [network name].txt is saved, containing the true and predicted categories, along with a simple confusion matrix visualisation ([network name].html). Additional evaluation statistics, such as per-category accuracies, can be computed from the evaluation text file by manually running the dockers/_common/Evaluation_tools.py script.
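If you prefer to compute such statistics yourself, a minimal sketch could look like the one below. It assumes that each line of the evaluation file holds a true category and a predicted category separated by whitespace; this may not match the exact format written by the framework, so adapt the parsing accordingly.

```python
from collections import defaultdict

# Hypothetical evaluation file format: one "true predicted" pair per line.
correct = defaultdict(int)
total = defaultdict(int)

with open("logs/my_network.txt") as f:
    for line in f:
        parts = line.split()
        if len(parts) < 2:
            continue
        true_cat, pred_cat = parts[0], parts[1]
        total[true_cat] += 1
        if pred_cat == true_cat:
            correct[true_cat] += 1

# Per-category accuracy
for cat in sorted(total):
    print(f"{cat}: {correct[cat] / total[cat]:.3f}")
```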