Neural Networks

This page lists all neural networks contained in our framework. For each network there is a link to the original paper describing it and to its original implementation. The input format of each network is indicated by the path of the directory containing the scripts that convert data to that format. Where useful, specific parameters are explained in more detail.

VRN

Paper: Generative and Discriminative Voxel Modeling with Convolutional Neural Networks
Original Code: https://github.com/ajbrock/Generative-and-Discriminative-Voxel-Modeling
Docker: dockers/vrn
Data: dockers/data_conversion/vrn_data
Framework: Theano with Lasagne

Details: This network takes a voxel grid, stored in compressed .npz format, as its input. The size of the voxel grid is set by the dim parameter. The network uses more than one rotation of the 3D model; you can change the number of rotations with the num_rotations parameter. This network is very large and slow, and even compiling the model takes up to fifteen minutes.
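
For illustration, a minimal config.ini excerpt setting these two parameters might look as follows (the values are only examples, not defaults taken from the repository):

    dim = 32              # edge length of the input voxel grid
    num_rotations = 24    # number of rotations of each model fed to the network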

OCTREE

Paper: O-CNN: Octree-based Convolutional Neural Networks
Original Code: https://github.com/Microsoft/O-CNN
Docker: dockers/octree
Data: dockers/data_conversion/octree_data
Framework: Caffe

Note: Our training/evaluation code achieves lower accuracy - the implementation provided by the authors (a.k.a. “vanilla” or octree_base) was used instead in the evaluation.

Details: This network is implemented in the Caffe framework, which works differently from the other frameworks. The network architecture is defined in a .prototxt file, and the training procedure is defined in a similar file called the solver. In order to configure the training parameters in the same way as for the other networks (via config.ini), we replace the relevant parts of the net/solver files. These template files are located in the examples directory, and the parameters in config.ini are automatically copied into them, replacing variables such as $BATCHSIZE (see prepare_nets.py for more details).

The solver and net parameters have to contain the paths to the solver and net definition files respectively, and both have to be enclosed in quotation marks. Similarly, snapshot_prefix has to contain a full path and has to be enclosed in quotation marks as well. Some of the parameters are in a special section [ITER_PARAMETERS], which contains training parameters that are measured in epochs. Parameters in this section are automatically converted to iterations, which Caffe uses internally. The conversion is transparent to the user; only the weights of the trained networks are saved with the number of iterations, instead of the number of epochs, in the file name.
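
To make the quoting rules and the epoch-based section concrete, a config.ini fragment might look roughly like the sketch below. The paths and values are made up for the example, and which keys actually belong in [ITER_PARAMETERS] depends on the framework; only the parameter names solver, net, snapshot_prefix and the section name itself come from the text above.

    solver = "/workspace/examples/solver.prototxt"    # full path, quoted
    net = "/workspace/examples/net.prototxt"          # full path, quoted
    snapshot_prefix = "/workspace/output/octree"      # full path, quoted

    [ITER_PARAMETERS]
    # example keys; values here are given in epochs and are converted
    # to Caffe iterations automatically
    max_iter = 80
    snapshot = 10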

Note: Caffe may crash with syncedmem.hpp:39] Check failed: error == cudaSuccess (29 vs. 0) driver shutting down or octree_parser.cpp:94] Check failed: h_metadata_ != nullptr after the training process ends (when freeing memory on exit). In that case, verify that the training finished correctly by looking at the log and at the evaluation results in the output folder.

OCTREE ADAPTIVE

Paper: Adaptive O-CNN: A Patch-based Deep Representation of 3D Shapes
Original Code: https://github.com/Microsoft/O-CNN
Docker: dockers/octree_adaptive
Data: dockers/data_conversion/octree_data (with adaptive=True)
Framework: Caffe

Details: Functions the same as the original Octree network described above.

VGG

Paper: Very Deep Convolutional Networks for Large-Scale Image Recognition
Original Code: https://github.com/machrisaa/tensorflow-vgg
Docker: dockers/vgg
Data: dockers/data_conversion/mvcnn_data_pbrt, dockers/data_conversion/mvcnn_data_blender
Framework: TensorFlow

Details: This is an implementation of the classic VGG network, modified to vote across multiple views in order to classify 3D models. Before running the network, download the pretrained VGG weights file here and copy it to the dockers/vgg folder. As with all of the following networks that use multi-view images, you can set the number of views used via the num_views parameter.

VGG is also used to extract features for the SEQ2SEQ network. If you want to do this, set the weights parameter to the version number of the trained VGG you want to use and set extract to True. The features will be saved to the root directory of the dataset, ready to be used by the SEQ2SEQ network.
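
A hedged config.ini sketch for both use cases described above; the values are only examples, and only the parameter names num_views, weights and extract come from the text:

    num_views = 12     # number of views used per model
    weights = 1        # version number of the trained VGG to load
    extract = True     # save extracted features to the dataset root for SEQ2SEQ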

MVCNN

Paper: Multi-view Convolutional Neural Networks for 3D Shape Recognition
Original Code: https://github.com/WeiTang114/MVCNN-TensorFlow
Docker: dockers/mvcnn
Data: dockers/data_conversion/mvcnn_data_pbrt, dockers/data_conversion/mvcnn_data_blender
Framework: TensorFlow

Details: Uses a pretrained AlexNet, which is prepared automatically. If you want to use different weights, you have to copy them into the Docker container and change the pretrained_network_file parameter.
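
For example (the path below is only a placeholder for wherever you copied the weights inside the container):

    pretrained_network_file = /path/inside/container/your_weights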

Note: The implementation differs from the original paper (different network structure).

MVCNN2

Paper: A Deeper Look at 3D Shape Classifiers
Original Code: https://github.com/jongchyisu/mvcnn_pytorch
Docker: dockers/mvcnn2
Data: dockers/data_conversion/mvcnn_data_pbrt, dockers/data_conversion/mvcnn_data_blender
Framework: PyTorch

Details: This network trains in two phases and requires a different batch size for each, so it has two separate batch-size parameters. You can choose one of several pretrained networks by setting the cnn_name parameter, or you can try training from scratch by setting no_pretraining=True.
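
The text does not name the two batch-size parameters, so the sketch below only shows the two parameters it does name; the values are illustrative:

    cnn_name = vgg11          # which pretrained backbone to use
    no_pretraining = False    # set to True to train from scratch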

ROTATIONNET

Paper: Joint Object Categorization and Pose Estimation Using Multiviews from Unsupervised Viewpoints
Original Code: https://github.com/kanezaki/rotationnet
Docker: dockers/rotnet
Data: dockers/data_conversion/mvcnn_data_pbrt, dockers/data_conversion/mvcnn_data_blender
Framework: Caffe

Details: RotationNet is implemented in Caffe, so all of the issues mentioned in the Octree section apply here as well; the only difference is that there are two separate network definition files - one for training and one for testing. Before training, download and extract the pre-trained weights from https://dl.dropboxusercontent.com/s/c6aqns2bvoqi86q/r-cnn-release1-data-ilsvrc2013-caffe-proto-v0.tgz.

SEQ2SEQ

Paper: SeqViews2SeqLabels: Learning 3D Global Features via Aggregating Sequential Views by RNN with Attention
Original Code: https://github.com/mingyangShang/SeqViews2SeqLabels
Docker: dockers/seq2seq
Data: either of (dockers/data_conversion/mvcnn_data_pbrt or dockers/data_conversion/mvcnn_data_blender) processed with dockers/vgg (explained above in the VGG section)
Framework: TensorFlow

Details: This network takes feature vectors extracted by a pretrained image classification network as its input. You can obtain these features by training and using the VGG network described above. The dimensionality of the feature space is controlled by the n_input_fc parameter; a VGG feature vector has 4096 dimensions by default. The paths to the .npy files containing the extracted features inside the Docker container are controlled by the train_feature_file, train_label_file, test_feature_file and test_label_file parameters.
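
An illustrative config.ini fragment with the parameters listed above (the paths are placeholders; only the parameter names and the 4096-dimensional VGG default come from the text):

    n_input_fc = 4096                                  # dimensionality of the input feature vectors
    train_feature_file = /data/dataset/train_features.npy
    train_label_file = /data/dataset/train_labels.npy
    test_feature_file = /data/dataset/test_features.npy
    test_label_file = /data/dataset/test_labels.npy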

POINTNET

Paper: PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation
Original Code: https://github.com/charlesq34/pointnet
Docker: dockers/pointnet
Data: dockers/data_conversion/pnet_data
Framework: TensorFlow

Details: This network takes a point cloud as its input. The number of points used is specified by the num_points parameter. During testing you can use voting across several rotated versions of the input to get better results; the number of rotations is controlled by the num_votes parameter.

Note: If you experience loss: nan during training, verify that num_classes is set correctly in config.ini.
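
Putting the three parameters mentioned above together, an illustrative config.ini fragment could look like this (the values are examples, not verified defaults):

    num_points = 1024    # points sampled from each model
    num_votes = 12       # number of rotations voted over during testing
    num_classes = 40     # must match the dataset, otherwise training can produce loss: nan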

POINTNET++

Paper: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space
Original Code: https://github.com/charlesq34/pointnet2
Docker: dockers/pointnet2
Data: dockers/data_conversion/pnet_data
Framework: TensorFlow

Details: Same as the original PointNet described above.

SONET

Paper: SO-Net: Self-Organizing Network for Point Cloud Analysis
Original Code: https://github.com/lijx10/SO-Net
Docker: dockers/sonet
Data: dockers/data_conversion/sonet_data
Framework: PyTorch

Details: Similarly to PointNet, this network takes a fixed number of points as its input, specified by the num_points parameter. In addition to the points, it uses a self-organizing map to obtain a better representation of these points; this is computed by the sonet_data Docker image/container. The config.ini file contains a large number of parameters; for their explanation, check the original paper and the original code.

KDNET

Paper: Escape from Cells: Deep Kd-Networks for the Recognition of 3D Point Cloud Models
Original Code: https://github.com/fxia22/kdnet.pytorch
Docker: dockers/kdnet
Data: dockers/data_conversion/kdnet_data
Framework: PyTorch

Details: This network constructs its input (a kd-tree built over a uniformly sampled point cloud) on the fly, but you still need to convert the mesh data to the .h5 format. You can find several parameters to set in the config.ini file, such as the depth of the network (the steps parameter), which also determines the number of points used (2 to the power of the depth of the network). You can also explore different data augmentation options under the [DATA_AUGMENTATION] section.
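
For example, using the relationship between steps and the number of points described above (the value is illustrative; the [DATA_AUGMENTATION] section name comes from the framework, but its keys are not listed here):

    steps = 10    # depth of the kd-tree and the network, i.e. 2^10 = 1024 input points

    [DATA_AUGMENTATION]
    # augmentation options go here; see config.ini in dockers/kdnet for the available keys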
