"ML Mondays"

"ML Mondays"

  • Docs
  • Data
  • Models
  • API
  • Help
  • Blog

›API

ML Mondays

  • Overview

Data

  • Data

Models and Workflows

  • Models

API

  • ML Mondays API

ML Mondays API

This is the class and function reference of the ML Mondays course code.

1_ImageRecog

General workflow using your own data

  1. Create a TFRecord dataset from your data, organised as follows:
  • copy training images into a folder called train
  • copy validation images into a folder called validation
  • ensure the class name is written to each file name. Ideally this is a prefix such that it is trivial to extract the class name from the file name
  • modify one of the provided workflows (such as tamucc_make_tfrecords.py) for your dataset, to create your train and validation tfrecord shards
  2. Set up your model
  • Decide whether you want to train a small custom model from scratch, a large model from scratch, or a large model trained using weights transferred from another task
  • If a small custom model, use make_cat_model with shallow=True for a relatively small model, and shallow=False for a relatively large model
  • If a large model with transfer learning, decide which one to utilize (transfer_learning_mobilenet_model, transfer_learning_xception_model, or transfer_learning_model_vgg)
  • If you wish to train a large model from scratch, decide which one to utilize (mobilenet_model or xception_model)
  3. Set up a data pipeline
  • Modify and follow the provided examples to create a get_training_dataset() and get_validation_dataset(). This will likely require you to copy and modify get_batched_dataset for your own needs, writing your own read_tfrecord function for your dataset, depending on the format of your labels in filenames and the model selected
  4. Set up a model training pipeline (see the sketch after this list)
  • .compile() your model with an appropriate loss function and metrics
  • define a LearningRateScheduler function to vary learning rates over training as a function of training epoch
  • define an EarlyStopping criterion and create a ModelCheckpoint to save trained model weights
  • if transfer learning using weights not from imagenet, load your initial weights from somewhere else
  5. Train the model
  • Use history = model.fit() to create a record of the training history. Pass the training and validation datasets, and a list of callbacks containing your model checkpoint, learning rate scheduler, and early stopping monitor
  6. Evaluate your model
  • Plot and study the history time-series of losses and metrics. If unsatisfactory, begin the iterative process of model optimization
  • Use loss, accuracy = model.evaluate(get_validation_dataset(), batch_size=BATCH_SIZE, steps=validation_steps) with the validation dataset, specifying the number of validation steps
  • Make plots of model outputs, organized in such a way that you can see at a glance where the model is failing. Use make_sample_plot and p_confmat, as a starting point, to visualize sample imagery with their model predictions, and a confusion matrix of predicted/true class-correspondences
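
A minimal sketch of steps 4 through 6, assuming the transfer_learning_mobilenet_model, lrfn, get_training_dataset, and get_validation_dataset functions documented below, and hypothetical values for the globals:

```python
import tensorflow as tf

# hypothetical values for the globals used throughout this reference
BATCH_SIZE = 32
TARGET_SIZE = 224
num_classes = 4
steps_per_epoch = 100    # number of training images // BATCH_SIZE
validation_steps = 25    # number of validation images // BATCH_SIZE

model = transfer_learning_mobilenet_model(num_classes, (TARGET_SIZE, TARGET_SIZE, 3))
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

callbacks = [
    tf.keras.callbacks.LearningRateScheduler(lrfn, verbose=True),
    tf.keras.callbacks.ModelCheckpoint('weights.best.h5', monitor='val_loss',
                                       save_best_only=True, save_weights_only=True),
    tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=10),
]

history = model.fit(get_training_dataset(), validation_data=get_validation_dataset(),
                    steps_per_epoch=steps_per_epoch, validation_steps=validation_steps,
                    epochs=100, callbacks=callbacks)

loss, accuracy = model.evaluate(get_validation_dataset(), steps=validation_steps)
```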

model_funcs.py

Model creation


transfer_learning_model_vgg
transfer_learning_model_vgg(num_classes, input_shape, dropout_rate=0.5)

This function creates an implementation of a convolutional deep learning model for estimating a discrete category based on vgg, trained using transfer learning (initialized using pretrained imagenet weights)

  • INPUTS:
    • num_classes = number of classes (output nodes on classification layer)
    • input_shape = size of input layer (i.e. image tensor)
  • OPTIONAL INPUTS:
    • dropout_rate = proportion of neurons to randomly set to zero, after the pooling layer
  • GLOBAL INPUTS: None
  • OUTPUTS: keras model instance

mobilenet_model
mobilenet_model(num_classes, input_shape, dropout_rate=0.5)

This function creates an implementation of a convolutional deep learning model for estimating a discrete category based on mobilenet, trained from scratch

  • INPUTS:
    • num_classes = number of classes (output nodes on classification layer)
    • input_shape = size of input layer (i.e. image tensor)
  • OPTIONAL INPUTS:
    • dropout_rate = proportion of neurons to randomly set to zero, after the pooling layer
  • GLOBAL INPUTS: None
  • OUTPUTS: keras model instance

transfer_learning_mobilenet_model
transfer_learning_mobilenet_model(num_classes, input_shape, dropout_rate=0.5)

This function creates an implementation of a convolutional deep learning model for estimating a discrete category based on mobilenet v2, trained using transfer learning (initialized using pretrained imagenet weights)

  • INPUTS:
    • num_classes = number of classes (output nodes on classification layer)
    • input_shape = size of input layer (i.e. image tensor)
  • OPTIONAL INPUTS:
    • dropout_rate = proportion of neurons to randomly set to zero, after the pooling layer
  • GLOBAL INPUTS: None
  • OUTPUTS: keras model instance

transfer_learning_xception_model
transfer_learning_xception_model(num_classes, input_shape, dropout_rate=0.25)

This function creates an implementation of a convolutional deep learning model for estimating a discrete category based on xception, trained using transfer learning (initialized using pretrained imagenet weights)

  • INPUTS:
    • num_classes = number of classes (output nodes on classification layer)
    • input_shape = size of input layer (i.e. image tensor)
  • OPTIONAL INPUTS:
    • dropout_rate = proportion of neurons to randomly set to zero, after the pooling layer
  • GLOBAL INPUTS: None
  • OUTPUTS: keras model instance

xception_model
xception_model(num_classes, input_shape, dropout_rate=0.25)

This function creates an implementation of a convolutional deep learning model for estimating a discrete category based on xception, trained from scratch

  • INPUTS:
    • num_classes = number of classes (output nodes on classification layer)
    • input_shape = size of input layer (i.e. image tensor)
  • OPTIONAL INPUTS:
    • dropout_rate = proportion of neurons to randomly set to zero, after the pooling layer
  • GLOBAL INPUTS: None
  • OUTPUTS: keras model instance

conv_block
conv_block(inp, filters=32, bn=True, pool=True)

This function generates a convolutional block

  • INPUTS:
    • inp = input layer
  • OPTIONAL INPUTS:
    • filters = number of convolutional filters to use
    • bn=True: use batch normalization in each convolutional layer
    • pool=True: use pooling in each convolutional layer
  • GLOBAL INPUTS: None
  • OUTPUTS: keras model layer object

make_cat_model
make_cat_model(num_classes, dropout, denseunits, base_filters, bn=False, pool=True, shallow=True)

This function creates an implementation of a convolutional deep learning model for estimating a discrete category

  • INPUTS:
    • num_classes = number of classes (output nodes on classification layer)
    • dropout = proportion of neurons to randomly set to zero, after the pooling layer
    • denseunits = number of neurons in the classifying layer
    • base_filters = number of convolutional filters to use in the first layer
  • OPTIONAL INPUTS:
    • bn=False: if True, use batch normalization in each convolutional layer
    • pool=True: if True, use pooling in each convolutional layer
    • shallow=True: if False, a larger model with more convolutional layers is used
  • GLOBAL INPUTS: TARGET_SIZE
  • OUTPUTS: keras model instance
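
The following is a hedged sketch of the kind of model make_cat_model builds (stacked convolutional blocks, pooling, dropout, and a dense classifying layer); the exact layer arrangement in the course code may differ:

```python
import tensorflow as tf

TARGET_SIZE = 224  # assumed value of the TARGET_SIZE global

def sketch_cat_model(num_classes, dropout, denseunits, base_filters, shallow=True):
    # stacked conv blocks that double the filters each time, then global
    # pooling, dropout, a dense layer, and a softmax classifying layer
    inp = tf.keras.layers.Input(shape=(TARGET_SIZE, TARGET_SIZE, 3))
    x = inp
    for i in range(2 if shallow else 4):
        x = tf.keras.layers.Conv2D(base_filters * 2**i, 3, padding='same', activation='relu')(x)
        x = tf.keras.layers.MaxPooling2D()(x)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dropout(dropout)(x)
    x = tf.keras.layers.Dense(denseunits, activation='relu')(x)
    out = tf.keras.layers.Dense(num_classes, activation='softmax')(x)
    return tf.keras.Model(inp, out)
```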

Model training


lrfn
lrfn(epoch)

This function creates a custom piecewise linear-exponential learning rate function for a custom learning rate scheduler. It is linear to a max, then exponentially decays

  • INPUTS: current epoch number
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS:start_lr, min_lr, max_lr, rampup_epochs, sustain_epochs, exp_decay
  • OUTPUTS: the function lr with all arguments passed
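
A sketch of the schedule lrfn describes, assuming hypothetical values for the global schedule parameters named above:

```python
# hypothetical values for the globals consumed by lrfn
start_lr, min_lr, max_lr = 1e-5, 1e-5, 1e-3
rampup_epochs, sustain_epochs, exp_decay = 5, 0, 0.9

def lrfn(epoch):
    # linear ramp from start_lr to max_lr, optional sustain period at max_lr,
    # then exponential decay towards min_lr
    if epoch < rampup_epochs:
        return (max_lr - start_lr) / rampup_epochs * epoch + start_lr
    elif epoch < rampup_epochs + sustain_epochs:
        return max_lr
    else:
        return (max_lr - min_lr) * exp_decay**(epoch - rampup_epochs - sustain_epochs) + min_lr
```

Pass this to tf.keras.callbacks.LearningRateScheduler(lrfn) so it is called once per epoch.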

tfrecords_funcs.py

TF-dataset creation


get_batched_dataset
get_batched_dataset(filenames)

This function defines a workflow for the model to read data from tfrecord files by defining the degree of parallelism, batch size, pre-fetching, etc and also formats the imagery properly for model training (assumes mobilenet by using read_tfrecord_mv2)

  • INPUTS:
    • filenames [list]
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: BATCH_SIZE, AUTO
  • OUTPUTS: tf.data.Dataset object
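
A minimal sketch of such a pipeline, assuming the read_tfrecord_mv2 parser documented below and hypothetical values for the globals:

```python
import tensorflow as tf

AUTO = tf.data.experimental.AUTOTUNE  # assumed value of the AUTO global
BATCH_SIZE = 32                       # assumed value of the BATCH_SIZE global

def get_batched_dataset_sketch(filenames):
    # read the shards in parallel, parse each example into (image, label),
    # then shuffle, repeat, batch, and prefetch for training
    ds = tf.data.TFRecordDataset(filenames, num_parallel_reads=AUTO)
    ds = ds.map(read_tfrecord_mv2, num_parallel_calls=AUTO)
    ds = ds.shuffle(2048).repeat().batch(BATCH_SIZE).prefetch(AUTO)
    return ds
```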

get_eval_dataset
get_eval_dataset(filenames)

This function defines a workflow for the model to read data from tfrecord files by defining the degree of parallelism, batch size, pre-fetching, etc and also formats the imagery properly for model training (assumes mobilenet by using read_tfrecord_mv2). This evaluation version does not .repeat() because it is not being called repeatedly by a model

  • INPUTS:
    • filenames [list]
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: BATCH_SIZE, AUTO
  • OUTPUTS: tf.data.Dataset object

resize_and_crop_image
resize_and_crop_image(image, label)

This function crops to square and resizes an image. The label passes through unmodified

  • INPUTS:
    • image [tensor array]
    • label [int]
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: TARGET_SIZE
  • OUTPUTS:
    • image [tensor array]
    • label [int]
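
A sketch of the crop-and-resize operation, assuming a centred square crop and the TARGET_SIZE global:

```python
import tensorflow as tf

TARGET_SIZE = 224  # assumed value of the TARGET_SIZE global

def resize_and_crop_image_sketch(image, label):
    # crop the largest centred square from the image, then resize to
    # TARGET_SIZE x TARGET_SIZE; the label passes through unmodified
    h, w = tf.shape(image)[0], tf.shape(image)[1]
    side = tf.minimum(h, w)
    image = tf.image.crop_to_bounding_box(image, (h - side) // 2, (w - side) // 2, side, side)
    image = tf.image.resize(image, [TARGET_SIZE, TARGET_SIZE])
    return image, label
```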

recompress_image
recompress_image(image, label)

This function takes an image encoded as a byte string and recodes as an 8-bit jpeg. Label passes through unmodified

  • INPUTS:
    • image [tensor array]
    • label [int]
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • image [tensor array]
    • label [int]

TFRecord reading


file2tensor
file2tensor(f, model='mobilenet')

This function reads a jpeg image from file into a cropped and resized tensor, for use in prediction with a trained mobilenet or vgg model (the imagery is standardized depending on target model framework)

  • INPUTS:
    • f [string] file name of jpeg
  • OPTIONAL INPUTS:
    • model = {'mobilenet' | 'vgg'}
  • OUTPUTS:
    • image [tensor array]: unstandardized image
    • im [tensor array]: standardized image
  • GLOBAL INPUTS: TARGET_SIZE

read_classes_from_json
read_classes_from_json(json_file)

This function reads the contents of a json file enumerating classes

  • INPUTS:
    • json_file [string]: full path to the json file
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS: CLASSES [list]: list of classes as byte strings

read_tfrecord_vgg
read_tfrecord_vgg(example)

This function reads an example record from a tfrecord file and parses into label and image ready for vgg model training

  • INPUTS:
    • example: a TFRecord 'example' object, containing an image and label
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: TARGET_SIZE
  • OUTPUTS:
    • image [tensor]: resized and pre-processed for vgg
    • class_label [tensor] 32-bit integer

read_tfrecord_mv2
read_tfrecord_mv2(example)

This function reads an example record from a tfrecord file and parses into label and image ready for mobilenet model training

  • INPUTS:
    • example: a TFRecord 'example' object, containing an image and label
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: TARGET_SIZE
  • OUTPUTS:
    • image [tensor]: resized and pre-processed for mobilenetv2
    • class_label [tensor] 32-bit integer

read_tfrecord
read_tfrecord(example)

This function reads an example from a TFrecord file into a single image and label

  • INPUTS:
    • TFRecord example object
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: TARGET_SIZE
  • OUTPUTS:
    • image [tensor array]
    • class_label [tensor int]

read_image_and_label
read_image_and_label(img_path)

This function reads a jpeg image from a provided filepath and extracts the label from the filename (assuming the class name is before "IMG" in the filename)

  • INPUTS:
    • img_path [string]: filepath to a jpeg image
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • image [tensor array]
    • class_label [tensor int]
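
A sketch of the filename convention described above, using tf.strings to take everything before "IMG" as the class string:

```python
import tensorflow as tf

def read_image_and_label_sketch(img_path):
    # decode the jpeg, then split the file name on "IMG" to recover the
    # class prefix (the convention assumed by read_image_and_label)
    bits = tf.io.read_file(img_path)
    image = tf.image.decode_jpeg(bits, channels=3)
    fname = tf.strings.split(img_path, sep='/')[-1]
    label = tf.strings.split(fname, sep='IMG')[0]
    return image, label
```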

TFRecord creation


get_dataset_for_tfrecords
get_dataset_for_tfrecords(recoded_dir, shared_size)

This function reads a list of TFRecord shard files, decodes the images and labels, resizes and crops the images to TARGET_SIZE, and creates batches

  • INPUTS:
    • recoded_dir
    • shared_size
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: TARGET_SIZE
  • OUTPUTS: tf.data.Dataset object

write_records
write_records(tamucc_dataset, tfrecord_dir, CLASSES)

This function writes a tf.data.Dataset object to TFRecord shards

  • INPUTS:
    • tamucc_dataset [tf.data.Dataset]
    • tfrecord_dir [string] : path to directory where files will be written
    • CLASSES [list] of class string names
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS: None (files written to disk)

to_tfrecord
to_tfrecord(img_bytes, label, CLASSES)

This function creates a TFRecord example from an image byte string and a label feature

  • INPUTS:
    • img_bytes: an image bytestring
    • label: label string of image
    • CLASSES: list of string classes in the entire dataset
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS: tf.train.Example
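
A sketch of such an example, assuming hypothetical feature key names ('image' and 'class'):

```python
import tensorflow as tf

def to_tfrecord_sketch(img_bytes, label, CLASSES):
    # encode the image bytestring and the integer class index as features
    # of a tf.train.Example, ready for serialization to a TFRecord file
    class_num = CLASSES.index(label)
    feature = {
        'image': tf.train.Feature(bytes_list=tf.train.BytesList(value=[img_bytes])),
        'class': tf.train.Feature(int64_list=tf.train.Int64List(value=[class_num])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))
```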

plot_funcs.py


plot_history
plot_history(history, train_hist_fig)

This function plots the training history of a model

  • INPUTS:
    • history [dict]: the output dictionary of the model.fit() process, i.e. history = model.fit(...)
    • train_hist_fig [string]: the filename where the plot will be printed
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS: None (figure printed to file)

get_label_pairs
get_label_pairs(val_ds, model)

This function gets label observations and model estimates

  • INPUTS:
    • val_ds: a batched data set object
    • model: trained and compiled keras model instance
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • labs [ndarray]: 1d vector of numeric labels
    • preds [ndarray]: 1d vector of corresponding model predicted numeric labels

p_confmat
p_confmat(labs, preds, cm_filename, CLASSES, thres = 0.1)

This function computes a confusion matrix (matrix of correspondences between true and estimated classes) using the sklearn function of the same name, then normalizes by column totals and makes a heatmap plot of the matrix, saving it to the provided filename, cm_filename

  • INPUTS:
    • labs [ndarray]: 1d vector of labels
    • preds [ndarray]: 1d vector of model predicted labels
    • cm_filename [string]: filename to write the figure to
    • CLASSES [list] of strings: class names
  • OPTIONAL INPUTS:
    • thres [float]: threshold controlling what values are displayed
  • GLOBAL INPUTS: None
  • OUTPUTS: None (figure printed to file)
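
A sketch of the normalization-and-plot logic, assuming seaborn is available for the heatmap:

```python
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix

def p_confmat_sketch(labs, preds, cm_filename, CLASSES, thres=0.1):
    # column-normalized confusion matrix, with small values masked out,
    # plotted as a heatmap and printed to file
    cm = confusion_matrix(labs, preds).astype(float)
    cm /= cm.sum(axis=0, keepdims=True) + 1e-12
    cm[cm < thres] = 0.0
    plt.figure(figsize=(12, 12))
    sns.heatmap(cm, annot=True, fmt='.2f', cmap='Blues',
                xticklabels=CLASSES, yticklabels=CLASSES)
    plt.xlabel('predicted class')
    plt.ylabel('true class')
    plt.savefig(cm_filename, dpi=200, bbox_inches='tight')
    plt.close()
```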

make_sample_plot
make_sample_plot(model, sample_filenames, test_samples_fig, CLASSES)

This function uses a trained model to estimate the class of each sample image, and plots the sample imagery with the model predictions, saving the figure to the provided filename, test_samples_fig

  • INPUTS:
    • model: trained and compiled keras model
    • sample_filenames: [list] of strings
    • test_samples_fig [string]: filename to print figure to
    • CLASSES [list] of strings: class names
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS: None (matplotlib figure, printed to file)

compute_hist
compute_hist(images)

Compute the per channel histogram for a batch of images

  • INPUTS:
    • images [ndarray]: batch of shape (N x W x H x 3)
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • hist_r [dict]: histogram frequencies {'hist'} and bins {'bins'} for red channel
    • hist_g [dict]: histogram frequencies {'hist'} and bins {'bins'} for green channel
    • hist_b [dict]: histogram frequencies {'hist'} and bins {'bins'} for blue channel
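
A sketch, assuming imagery scaled to [0, 1] and 64 histogram bins:

```python
import numpy as np

def compute_hist_sketch(images):
    # per-channel histogram over a batch of (N x W x H x 3) images,
    # returned as {'hist': frequencies, 'bins': bin edges} per channel
    out = []
    for channel in range(3):
        h, b = np.histogram(images[..., channel].ravel(), bins=64, range=(0, 1))
        out.append({'hist': h, 'bins': b})
    hist_r, hist_g, hist_b = out
    return hist_r, hist_g, hist_b
```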

plot_distribution
plot_distribution(images, labels, class_id, CLASSES)

Plot the per-channel histograms of the images in a batch that belong to the class class_id

  • INPUTS:
    • images [ndarray]: batch of shape (N x W x H x 3)
    • labels [ndarray]: batch of shape (N x 1)
    • class_id [int]: class integer to plot
    • CLASSES [list] of strings: class names
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS: matplotlib figure

plot_one_class
plot_one_class(inp_batch, sample_idx, label, batch_size, CLASSES, rows=8, cols=8, size=(20,15))

Plot batch_size images that belong to the class label

  • INPUTS:
    • inp_batch [ndarray]: batch of N images
    • sample_idx [list]: indices of the N images
    • label [string]: class string
    • batch_size [int]: number of images to plot
    • CLASSES [list] of strings: class names
  • OPTIONAL INPUTS:
    • rows=8 [int]: number of rows
    • cols=8 [int]: number of columns
    • size=(20,15) [tuple]: size of matplotlib figure
  • GLOBAL INPUTS: None
  • OUTPUTS: None (matplotlib figure, printed to file)

compute_mean_image
compute_mean_image(images, opt="mean")

Compute and return mean image given a batch of images

  • INPUTS:
    • images [ndarray]: batch of shape (N x W x H x 3)
  • OPTIONAL INPUTS:
    • opt="mean" or "median"
  • GLOBAL INPUTS: None
  • OUTPUTS: 2d mean image [ndarray]

plot_mean_images
plot_mean_images(images, labels, CLASSES, rows=3, cols = 2)

Plot the mean image of a set of images

  • INPUTS:
    • images [ndarray]: batch of shape (N x W x H x 3)
    • labels [ndarray]: batch of shape (N x 1)
  • OPTIONAL INPUTS:
    • rows [int]: number of rows
    • cols [int]: number of columns
  • GLOBAL INPUTS: CLASSES
  • OUTPUTS: matplotlib figure

plot_tsne
plot_tsne(tsne_result, label_ids, CLASSES)

Plot TSNE loadings and colour code by class

  • INPUTS:
    • tsne_result [ndarray]: N x 2 data of loadings on two axes
    • label_ids [int]: N class labels
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: CLASSES
  • OUTPUTS: matplotlib figure, matplotlib figure axes object

visualize_scatter_with_images
visualize_scatter_with_images(X_2d_data, labels, images, figsize=(15,15), image_zoom=1, xlim=(-3,3), ylim=(-3,3))

Plot 2D (e.g. TSNE) loadings and plot an image thumbnail at each location

  • INPUTS:
    • X_2d_data [ndarray]: N x 2 data of loadings on two axes
    • labels [ndarray]: N class labels
    • images [ndarray]: N batch of images to plot
  • OPTIONAL INPUTS:
    • figsize=(15,15)
    • image_zoom=1 [float]: control the scaling of the imagery (make smaller for smaller thumbnails)
    • xlim = (-3,3) [tuple]: set x axes limits
    • ylim = (-3,3) [tuple]: set y axes limits
  • GLOBAL INPUTS: None
  • OUTPUTS: matplotlib figure

2_ObjRecog

General workflow using your own data

  1. Create a TFRecord dataset from your data, organised as follows:
  • copy training images into a folder called train
  • copy validation images into a folder called validation
  • create a csv file that lists each of the objects in each image, with the following columns (see the example below): filename, width (int), height (int), class (string), and the bounding box pixel coordinates xmin, ymin, xmax, ymax
  • modify the provided workflow (secoora_make_tfrecords.py) for your dataset, to create your train and validation tfrecord shards
  filename, width,  height, class,  xmin,   ymin,   xmax,   ymax
  staugustinecam.2019-04-18_1400.mp4_frames_25.jpg, 1280,   720,    person, 1088,   581,    1129,   631
  staugustinecam.2019-04-18_1400.mp4_frames_25.jpg, 1280,   720,    person, 1125,   524,    1183,   573
  staugustinecam.2019-04-04_0700.mp4_frames_51.jpg, 1280,   720,    person, 158,    198,    178,    244
  staugustinecam.2019-04-04_0700.mp4_frames_51.jpg, 1280,   720,    person, 131,    197,    162,    244
  staugustinecam.2019-04-04_0700.mp4_frames_51.jpg, 1280,   720,    person, 40, 504,    87, 581
  staugustinecam.2019-04-04_0700.mp4_frames_51.jpg, 1280,   720,    person, 0,  492,    15, 572
  staugustinecam.2019-01-01_1400.mp4_frames_44.jpg, 1280,   720,    person, 1086,   537,    1130,   615
  staugustinecam.2019-01-01_1400.mp4_frames_44.jpg, 1280,   720,    person, 1064,   581,    1134,   624
  staugustinecam.2019-01-01_1400.mp4_frames_44.jpg, 1280,   720,    person, 1136,   526,    1186,   570
  2. Set up your model
  • Decide whether you want to train a model from scratch, or one trained using weights transferred from another task (such as coco 2017)
  3. Set up a model training pipeline (see the sketch after this list)
  • .compile() your model with an appropriate loss function and metrics
  • define a LearningRateScheduler function to vary learning rates over training as a function of training epoch
  • define an EarlyStopping criterion and create a ModelCheckpoint to save trained model weights
  • if transfer learning using weights not from coco, load your initial weights from somewhere else
  4. Train the model
  • Use history = model.fit() to create a record of the training history. Pass the training and validation datasets, and a list of callbacks containing your model checkpoint, learning rate scheduler, and early stopping monitor
  5. Evaluate your model
  • Plot and study the history time-series of losses and metrics. If unsatisfactory, begin the iterative process of model optimization
  • Use loss, accuracy = model.evaluate(get_validation_dataset(), batch_size=BATCH_SIZE, steps=validation_steps) with the validation dataset, specifying the number of validation steps
  • Make plots of model outputs, organized in such a way that you can see at a glance where the model is failing. Use visualize_detections, as a starting point, to visualize sample imagery with their model predictions
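
A minimal sketch of steps 2 through 5, assuming the RetinaNet, get_backbone, RetinaNetLoss, and prepare_secoora_datasets_for_training names documented below, with hypothetical paths and filenames:

```python
import tensorflow as tf

num_classes = 1  # e.g. 'person' only, as in the example csv above

model = RetinaNet(num_classes, get_backbone())
model.compile(loss=RetinaNetLoss(num_classes),
              optimizer=tf.keras.optimizers.SGD(learning_rate=0.001, momentum=0.9))

# hypothetical shard locations
data_path = 'data/secoora'
train_filenames = tf.io.gfile.glob(data_path + '/train*.tfrecord')
val_filenames = tf.io.gfile.glob(data_path + '/val*.tfrecord')
val_dataset, train_dataset = prepare_secoora_datasets_for_training(
    data_path, train_filenames, val_filenames)

callbacks = [
    tf.keras.callbacks.ModelCheckpoint('retinanet_weights.h5', monitor='val_loss',
                                       save_best_only=True, save_weights_only=True),
    tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=10),
]

history = model.fit(train_dataset, validation_data=val_dataset,
                    epochs=50, callbacks=callbacks)
```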

model_funcs.py

Model creation


AnchorBox
AnchorBox()

Code from https://keras.io/examples/vision/retinanet/. Generates anchor boxes. This class has operations to generate anchor boxes for feature maps at strides [8, 16, 32, 64, 128], where each anchor box is of the format [x, y, width, height].

  • INPUTS:
    • aspect_ratios: A list of float values representing the aspect ratios of the anchor boxes at each location on the feature map
    • scales: A list of float values representing the scale of the anchor boxes at each location on the feature map.
    • num_anchors: The number of anchor boxes at each location on feature map
    • areas: A list of float values representing the areas of the anchor boxes for each feature map in the feature pyramid.
    • strides: A list of float value representing the strides for each feature map in the feature pyramid.
  • OPTIONAL INPUTS: None
  • OUTPUTS: anchor boxes for all the feature maps, stacked as a single tensor with shape (total_anchors, 4), when AnchorBox._get_anchors() is called
  • GLOBAL INPUTS: None

get_backbone
get_backbone()

Code from https://keras.io/examples/vision/retinanet/. This function builds ResNet50 with pre-trained imagenet weights

  • INPUTS: None
  • OPTIONAL INPUTS: None
  • OUTPUTS: keras Model
  • GLOBAL INPUTS: BATCH_SIZE

FeaturePyramid
FeaturePyramid()

Code from https://keras.io/examples/vision/retinanet/. This class builds the Feature Pyramid with the feature maps from the backbone.

  • INPUTS:
    • num_classes: Number of classes in the dataset.
    • backbone: The backbone to build the feature pyramid from. Currently supports ResNet50 only (the output of get_backbone())
  • OPTIONAL INPUTS: None
  • OUTPUTS: the 5-feature pyramids (feature maps) at strides [8, 16, 32, 64, 128]
  • GLOBAL INPUTS: None

build_head
build_head(output_filters, bias_init)

Code from https://keras.io/examples/vision/retinanet/. This function builds the class/box predictions head.

  • INPUTS:
    • output_filters: Number of convolution filters in the final layer.
    • bias_init: Bias Initializer for the final convolution layer.
  • OPTIONAL INPUTS: None
  • OUTPUTS: a keras sequential model representing either the classification or the box regression head depending on output_filters.
  • GLOBAL INPUTS: None

RetinaNet
RetinaNet()

Code from https://keras.io/examples/vision/retinanet/. This class returns a subclassed Keras model implementing the RetinaNet architecture.

  • INPUTS:
    • num_classes: Number of classes in the dataset.
    • backbone: The backbone to build the feature pyramid from. Supports ResNet50 only.
  • OPTIONAL INPUTS: None
  • OUTPUTS: a subclassed keras model implementing the RetinaNet architecture
  • GLOBAL INPUTS: None

Model training


compute_iou
compute_iou(boxes1, boxes2)

This function computes pairwise IOU matrix for given two sets of boxes

  • INPUTS:
    • boxes1: A tensor with shape (N, 4) representing bounding boxes where each box is of the format [x, y, width, height].
    • boxes2: A tensor with shape (M, 4) representing bounding boxes where each box is of the format [x, y, width, height].
  • OPTIONAL INPUTS: None
  • OUTPUTS: pairwise IOU matrix with shape (N, M), where the value at ith row jth column holds the IOU between ith box and jth box from boxes1 and boxes2 respectively.
  • GLOBAL INPUTS: None
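
A sketch of the pairwise IOU computation for boxes in [x, y, width, height] format:

```python
import tensorflow as tf

def compute_iou_sketch(boxes1, boxes2):
    # convert centre/size boxes to corner coordinates, intersect all pairs,
    # and divide intersection area by union area
    def to_corners(b):
        return tf.concat([b[..., :2] - b[..., 2:] / 2.0,
                          b[..., :2] + b[..., 2:] / 2.0], axis=-1)
    c1, c2 = to_corners(boxes1), to_corners(boxes2)
    lu = tf.maximum(c1[:, None, :2], c2[None, :, :2])  # upper-left of intersection
    rd = tf.minimum(c1[:, None, 2:], c2[None, :, 2:])  # lower-right of intersection
    wh = tf.maximum(0.0, rd - lu)
    intersection = wh[..., 0] * wh[..., 1]
    area1 = boxes1[:, 2] * boxes1[:, 3]
    area2 = boxes2[:, 2] * boxes2[:, 3]
    union = tf.maximum(area1[:, None] + area2[None, :] - intersection, 1e-8)
    return tf.clip_by_value(intersection / union, 0.0, 1.0)
```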

RetinaNetBoxLoss
RetinaNetBoxLoss()

Code from https://keras.io/examples/vision/retinanet/. This class implements smooth L1 loss

  • INPUTS:
    • y_true [tensor]: label observations
    • y_pred [tensor]: label estimates
  • OPTIONAL INPUTS: None
  • OUTPUTS:
    • loss [tensor]
  • GLOBAL INPUTS: None

RetinaNetClassificationLoss
RetinaNetClassificationLoss()

Code from https://keras.io/examples/vision/retinanet/. This class implements Focal loss.

  • INPUTS:
    • y_true [tensor]: label observations
    • y_pred [tensor]: label estimates
  • OPTIONAL INPUTS: None
  • OUTPUTS:
    • loss [tensor]
  • GLOBAL INPUTS: None

RetinaNetLoss
RetinaNetLoss()

Code from https://keras.io/examples/vision/retinanet/. This class is a wrapper that sums the RetinaNetBoxLoss and RetinaNetClassificationLoss outputs.

  • INPUTS:
    • y_true [tensor]: label observations
    • y_pred [tensor]: label estimates
  • OPTIONAL INPUTS: None
  • OUTPUTS:
    • loss [tensor]
  • GLOBAL INPUTS: None

Model prediction


DecodePredictions
DecodePredictions()

Code from https://keras.io/examples/vision/retinanet/. This class creates a Keras layer that decodes predictions of the RetinaNet model.

  • INPUTS:
    • num_classes: Number of classes in the dataset
    • confidence_threshold: Minimum class probability, below which detections are pruned.
    • nms_iou_threshold: IOU threshold for the NMS operation
    • max_detections_per_class: Maximum number of detections to retain per class.
    • max_detections: Maximum number of detections to retain across all classes.
    • box_variance: The scaling factors used to scale the bounding box predictions.
  • OPTIONAL INPUTS: None
  • OUTPUTS: a keras layer to decode predictions
  • GLOBAL INPUTS: None

data_funcs.py


random_flip_horizontal
random_flip_horizontal(image, boxes)

Flips image and boxes horizontally with 50% chance

  • INPUTS:
    • image: A 3-D tensor of shape (height, width, channels) representing an image.
    • boxes: A tensor with shape (num_boxes, 4) representing bounding boxes, having normalized coordinates.
  • OUTPUTS: Randomly flipped image and boxes

resize_and_pad_image
resize_and_pad_image(image, min_side=800.0, max_side=1333.0, jitter=[640, 1024], stride=128.0)

Resizes and pads image while preserving aspect ratio.

  1. Resizes images so that the shorter side is equal to min_side
  2. If the longer side is greater than max_side, then resize the image with longer side equal to max_side
  3. Pad with zeros on right and bottom to make the image shape divisible by stride
  • INPUTS:
    • image: A 3-D tensor of shape (height, width, channels) representing an image.
    • min_side: The shorter side of the image is resized to this value, if jitter is set to None.
    • max_side: If the longer side of the image exceeds this value after resizing, the image is resized such that the longer side equals this value.
    • jitter: A list of floats containing minimum and maximum size for scale jittering. If available, the shorter side of the image will be resized to a random value in this range.
    • stride: The stride of the smallest feature map in the feature pyramid. Can be calculated using image_size / feature_map_size.
  • OUTPUTS:
    • image: Resized and padded image
    • image_shape: Shape of the image before padding
    • ratio: The scaling factor used to resize the image

preprocess_secoora_data
preprocess_secoora_data(example)

This function preprocesses a secoora dataset for training

  • INPUTS:
    • example: a TFRecord 'example' object, containing an image with associated bounding boxes and class labels
  • OPTIONAL INPUTS: None
  • OUTPUTS:
    • image, bounding boxes, and class labels, preprocessed for training
  • GLOBAL INPUTS: None

preprocess_coco_data
preprocess_coco_data(sample)

Applies preprocessing step to a single sample

  • INPUTS:
    • sample: A dict representing a single training sample.
  • OPTIONAL INPUTS: None
  • OUTPUTS:
    • image: Resized and padded image with random horizontal flipping applied.
    • bbox: Bounding boxes with the shape (num_objects, 4) where each box is of the format [x, y, width, height].
    • class_id: A tensor representing the class id of the objects, having shape (num_objects,).

swap_xy
swap_xy(boxes)

Swaps the order of the x and y coordinates of the boxes.

  • INPUTS: boxes: A tensor with shape (num_boxes, 4) representing bounding boxes.
  • OUTPUTS: swapped boxes with shape same as that of boxes.

convert_to_xywh
convert_to_xywh(boxes)

Changes the box format to center, width and height.

  • INPUTS:
    • boxes: A tensor of rank 2 or higher with a shape of (..., num_boxes, 4) representing bounding boxes where each box is of the format [xmin, ymin, xmax, ymax].
  • OUTPUTS: converted boxes with shape same as that of boxes.

convert_to_corners
convert_to_corners(boxes)

Changes the box format to corner coordinates

  • INPUTS:
    • boxes: A tensor of rank 2 or higher with a shape of (..., num_boxes, 4) representing bounding boxes where each box is of the format [x, y, width, height].
  • OUTPUTS: converted boxes with shape same as that of boxes.
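
Both conversions are a concatenation along the last axis; a sketch:

```python
import tensorflow as tf

def convert_to_xywh_sketch(boxes):
    # [xmin, ymin, xmax, ymax] -> [x_center, y_center, width, height]
    return tf.concat([(boxes[..., :2] + boxes[..., 2:]) / 2.0,
                      boxes[..., 2:] - boxes[..., :2]], axis=-1)

def convert_to_corners_sketch(boxes):
    # [x_center, y_center, width, height] -> [xmin, ymin, xmax, ymax]
    return tf.concat([boxes[..., :2] - boxes[..., 2:] / 2.0,
                      boxes[..., :2] + boxes[..., 2:] / 2.0], axis=-1)
```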

compute_iou
compute_iou(boxes1, boxes2)

This function computes pairwise IOU matrix for given two sets of boxes

  • INPUTS:
    • boxes1: A tensor with shape (N, 4) representing bounding boxes where each box is of the format [x, y, width, height].
    • boxes2: A tensor with shape (M, 4) representing bounding boxes where each box is of the format [x, y, width, height].
  • OPTIONAL INPUTS: None
  • OUTPUTS: pairwise IOU matrix with shape (N, M), where the value at ith row jth column holds the IOU between ith box and jth box from boxes1 and boxes2 respectively.
  • GLOBAL INPUTS: None

AnchorBox
AnchorBox()

Code from https://keras.io/examples/vision/retinanet/. Generates anchor boxes. This class has operations to generate anchor boxes for feature maps at strides [8, 16, 32, 64, 128], where each anchor box is of the format [x, y, width, height].

  • INPUTS:
    • aspect_ratios: A list of float values representing the aspect ratios of the anchor boxes at each location on the feature map
    • scales: A list of float values representing the scale of the anchor boxes at each location on the feature map.
    • num_anchors: The number of anchor boxes at each location on feature map
    • areas: A list of float values representing the areas of the anchor boxes for each feature map in the feature pyramid.
    • strides: A list of float value representing the strides for each feature map in the feature pyramid.
  • OPTIONAL INPUTS: None
  • OUTPUTS: anchor boxes for all the feature maps, stacked as a single tensor with shape (total_anchors, 4), when AnchorBox._get_anchors() is called
  • GLOBAL INPUTS: None

LabelEncoderCoco
LabelEncoderCoco()

Transforms the raw labels into targets for training. This class has operations to generate targets for a batch of samples which is made up of the input images, bounding boxes for the objects present and their class ids.

  • INPUTS:
    • anchor_box: Anchor box generator to encode the bounding boxes.
    • box_variance: The scaling factors used to scale the bounding box targets.

tfrecords_funcs.py

TF-dataset creation


prepare_image
prepare_image(image)

This function resizes and pads an image, and rescales for resnet

  • INPUTS:
    • image [tensor array]
  • OPTIONAL INPUTS: None
  • OUTPUTS:
    • image [tensor array]
  • GLOBAL INPUTS: None

prepare_secoora_datasets_for_training
prepare_secoora_datasets_for_training(data_path, train_filenames, val_filenames)

This function prepares train and validation datasets by extracting features (images, bounding boxes, and class labels) then map to preprocess_secoora_data, then apply prefetch, padded batch and label encoder

  • INPUTS:
    • data_path [string]: path to the tfrecords
    • train_filenames [string]: tfrecord filenames for training
    • val_filenames [string]: tfrecord filenames for validation
  • OPTIONAL INPUTS: None
  • OUTPUTS:
    • val_dataset [tensorflow dataset]: validation dataset
    • train_dataset [tensorflow dataset]: training dataset
  • GLOBAL INPUTS: None

prepare_coco_datasets_for_training
prepare_coco_datasets_for_training(train_dataset, val_dataset)

This function prepares a coco dataset loaded from tfds into one trainable by the model

  • INPUTS:
    • val_dataset [tensorflow dataset]: validation dataset
    • train_dataset [tensorflow dataset]: training dataset
  • OPTIONAL INPUTS: None
  • OUTPUTS:
    • val_dataset [tensorflow dataset]: validation dataset
    • train_dataset [tensorflow dataset]: training dataset
  • GLOBAL INPUTS: BATCH_SIZE

file2tensor
file2tensor(f)

This function reads a jpeg image from file into a cropped and resized tensor, for use in prediction with a trained object detection model (the imagery is standardized for the target model framework)

  • INPUTS:
    • f [string] file name of jpeg
  • OPTIONAL INPUTS: None
  • OUTPUTS:
    • image [tensor array]: unstandardized image
    • im [tensor array]: standardized image
  • GLOBAL INPUTS: TARGET_SIZE

TFRecord creation


write_tfrecords
write_tfrecords(output_path, image_dir, csv_input)

This function writes tfrecords to disk

  • INPUTS:
    • image_dir [string]: place where jpeg images are
    • csv_input [string]: csv file that contains the labels
    • output_path [string]: place to write files to
  • OPTIONAL INPUTS: None
  • OUTPUTS: None (tfrecord files written to disk)
  • GLOBAL INPUTS: BATCH_SIZE

class_text_to_int
class_text_to_int(row_label)

This function converts the string 'person' into the number 1

  • INPUTS:
    • row_label [string]: class label string
  • OPTIONAL INPUTS: None
  • OUTPUTS: 1 or None
  • GLOBAL INPUTS: BATCH_SIZE

split
split(df, group)

This function splits a pandas dataframe by a pandas group object to extract the label sets from each image for writing to tfrecords

  • INPUTS:
    • df [pandas dataframe]
    • group [pandas dataframe group object]
  • OPTIONAL INPUTS: None
  • OUTPUTS: tuple of bboxes and classes per image
  • GLOBAL INPUTS: BATCH_SIZE

create_tf_example_coco
create_tf_example_coco(group, path)

This function creates an example tfrecord consisting of an image and label encoded as bytestrings. The jpeg image is read into a bytestring, and the bbox coordinates and classes are collated and converted also.

  • INPUTS:
    • group [pandas dataframe group object]
    • path [string]: path to the directory of jpeg images
  • OPTIONAL INPUTS: None
  • OUTPUTS:
    • tf_example [tf.train.Example object]
  • GLOBAL INPUTS: BATCH_SIZE

plot_funcs.py


plot_history
plot_history(history, train_hist_fig)

This function plots the training history of a model

  • INPUTS:
    • history [dict]: the output dictionary of the model.fit() process, i.e. history = model.fit(...)
    • train_hist_fig [string]: the filename where the plot will be printed
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS: None (figure printed to file)

visualize_detections
visualize_detections(image, boxes, classes, scores, counter, str_prefix, figsize=(7, 7), linewidth=1, color=[0, 0, 1])

This function allows for visualization of imagery and bounding boxes

  • INPUTS:
    • images [ndarray]: batch of images
    • boxes [ndarray]: batch of bounding boxes per image
    • classes [list]: class strings
    • scores [list]: prediction scores
    • counter [int]: counter used in the output figure filename
    • str_prefix [string]: filename prefix
  • OPTIONAL INPUTS:
    • figsize=(7, 7), figure size
    • linewidth=1, box line width
    • color=[0, 0, 1], box colour
  • OUTPUTS: None (matplotlib figure, printed to file)
  • GLOBAL INPUTS: None

3_ImageSeg

General workflow using your own data

model_funcs.py

Model creation


batchnorm_act
batchnorm_act(x)

This function applies batch normalization to a keras model layer, x, then a relu activation function

  • INPUTS:
    • x: keras model layer (should be the output of a convolution or an input layer)
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • batch normalized and relu-activated x

conv_block
conv_block(x, filters, kernel_size=(3, 3), padding="same", strides=1)

This function applies batch normalization to an input layer, then convolves it with a 2D convolution layer. These two actions combined are called a convolutional block

  • INPUTS:
    • filters: number of filters in the convolutional block
    • x: input keras layer to be convolved by the block
  • OPTIONAL INPUTS:
    • kernel_size=(3, 3): tuple of kernel size (x, y) - this is the size in pixels of the kernel to be convolved with the image
    • padding="same": see tf.keras.layers.Conv2D
    • strides=1: see tf.keras.layers.Conv2D
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • keras layer, output of the batch normalized convolution

bottleneck_block
bottleneck_block(x, filters, kernel_size=(3, 3), padding="same", strides=1)

This function creates a bottleneck block layer, which is the addition of a convolution block and a batch normalized/activated block

  • INPUTS:
    • filters: number of filters in the convolutional block
    • x: input keras layer
  • OPTIONAL INPUTS:
    • kernel_size=(3, 3): tuple of kernel size (x, y) - this is the size in pixels of the kernel to be convolved with the image
    • padding="same": see tf.keras.layers.Conv2D
    • strides=1: see tf.keras.layers.Conv2D
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • keras layer, output of the addition between convolutional and bottleneck layers

res_block
res_block(x, filters, kernel_size=(3, 3), padding="same", strides=1)

This function creates a residual block layer, which is the addition of a residual convolution block and a batch normalized/activated block

  • INPUTS:
    • filters: number of filters in the convolutional block
    • x: input keras layer
  • OPTIONAL INPUTS:
    • kernel_size=(3, 3): tuple of kernel size (x, y) - this is the size in pixels of the kernel to be convolved with the image
    • padding="same": see tf.keras.layers.Conv2D
    • strides=1: see tf.keras.layers.Conv2D
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • keras layer, output of the addition between residual convolutional and bottleneck layers
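
A hedged sketch of how these building blocks typically fit together in a residual U-Net (the exact course implementation may differ):

```python
import tensorflow as tf

def batchnorm_act_sketch(x):
    # batch normalization followed by a relu activation
    return tf.keras.layers.Activation('relu')(tf.keras.layers.BatchNormalization()(x))

def conv_block_sketch(x, filters, kernel_size=(3, 3), padding='same', strides=1):
    # batch-normalize and activate, then convolve
    x = batchnorm_act_sketch(x)
    return tf.keras.layers.Conv2D(filters, kernel_size, padding=padding, strides=strides)(x)

def res_block_sketch(x, filters, kernel_size=(3, 3), padding='same', strides=1):
    # two stacked conv blocks, added to a batch-normalized 1x1-convolved shortcut
    res = conv_block_sketch(x, filters, kernel_size, padding, strides)
    res = conv_block_sketch(res, filters, kernel_size, padding, 1)
    shortcut = tf.keras.layers.Conv2D(filters, (1, 1), padding=padding, strides=strides)(x)
    shortcut = tf.keras.layers.BatchNormalization()(shortcut)
    return tf.keras.layers.Add()([shortcut, res])
```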

upsamp_concat_block
upsamp_concat_block(x, xskip)

This function takes an input layer and creates a concatenation of an upsampled version and a residual or 'skip' connection

  • INPUTS:
    • xskip: input keras layer (skip connection)
    • x: input keras layer
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • keras layer, output of the addition between residual convolutional and bottleneck layers

res_unet
res_unet(sz, f, flag, nclasses=1)

This function creates a custom residual U-Net model for image segmentation

  • INPUTS:
    • sz: [tuple] size of input image
    • f: [int] number of filters in the convolutional block
    • flag: [string] if 'binary', the model will expect 2D masks and uses sigmoid. If 'multiclass', the model will expect 3D masks and uses softmax
  • OPTIONAL INPUTS:
    • nclasses=1 [int]: number of classes
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • keras model

Model training


metrics_np
metrics_np(y_true, y_pred, metric_name, metric_type='standard', drop_last = True, mean_per_class=False, verbose=False)

Compute mean metrics of two segmentation masks, via numpy.

IoU(A,B) = |A & B| / (| A U B|)
Dice(A,B) = 2*|A & B| / (|A| + |B|)
  • INPUTS:
    • y_true: true masks, one-hot encoded. Inputs are B * W * H * N tensors, with
      • B = batch size,
      • W = width,
      • H = height,
      • N = number of classes
    • y_pred: predicted masks, either softmax outputs, or one-hot encoded. Inputs are B * W * H * N tensors, with
      • B = batch size,
      • W = width,
      • H = height,
      • N = number of classes
    • metric_name: metric to be computed, either 'iou' or 'dice'.
    • metric_type: one of 'standard' (default), 'soft', 'naive'. In the standard version, y_pred is one-hot encoded and the mean is taken only over classes that are present (in y_true or y_pred). The 'soft' version of the metrics are computed without one-hot encoding y_pred. The 'naive' version returns mean metrics where absent classes contribute to the class mean as 1.0 (instead of being dropped from the mean).
    • drop_last = True: boolean flag to drop last class (usually reserved for background class in semantic segmentation)
    • mean_per_class = False: return mean along batch axis for each class.
    • verbose = False: print intermediate results such as intersection, union (as number of pixels).
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • IoU/Dice of y_true and y_pred [float] unless mean_per_class == True in which case it returns the per-class metric, averaged over the batch.
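
A simplified numpy sketch of the 'standard' IoU computation (one-hot the predictions, then average over classes that are present; drop_last and the other options are omitted):

```python
import numpy as np

def iou_sketch(y_true, y_pred):
    # y_true, y_pred: (B x W x H x N) one-hot / softmax masks
    y_pred = (y_pred == y_pred.max(axis=-1, keepdims=True)).astype(float)
    intersection = np.sum(y_true * y_pred, axis=(1, 2))        # per image, per class
    union = np.sum(y_true + y_pred, axis=(1, 2)) - intersection
    present = union > 0
    return np.mean(intersection[present] / union[present])
```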

mean_iou_np
mean_iou_np(y_true, y_pred)

This function calls metrics_np to compute IoU


mean_iou
mean_iou(y_true, y_pred)

This function computes the mean IoU between y_true and y_pred: this version is tensorflow (not numpy) and is used by tensorflow training and evaluation functions

  • INPUTS:
    • y_true: true masks, one-hot encoded. Inputs are B * W * H * N tensors, with
      • B = batch size,
      • W = width,
      • H = height,
      • N = number of classes
    • y_pred: predicted masks, either softmax outputs, or one-hot encoded. Inputs are B * W * H * N tensors, with
      • B = batch size,
      • W = width,
      • H = height,
      • N = number of classes
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • IoU score [tensor]

dice_coef
dice_coef(y_true, y_pred)

This function computes the mean Dice coefficient between y_true and y_pred: this version is tensorflow (not numpy) and is used by tensorflow training and evaluation functions

  • INPUTS:
    • y_true: true masks, one-hot encoded. Inputs are B * W * H * N tensors, with
      • B = batch size,
      • W = width,
      • H = height,
      • N = number of classes
    • y_pred: predicted masks, either softmax outputs, or one-hot encoded. Inputs are B * W * H * N tensors, with
      • B = batch size,
      • W = width,
      • H = height,
      • N = number of classes
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • Dice score [tensor]

dice_coef_loss
dice_coef_loss(y_true, y_pred)

This function computes the mean Dice loss (1 - dice coefficient) between y_true and y_pred: this version is tensorflow (not numpy) and is used by tensorflow training and evaluation functions

  • INPUTS:
    • y_true: true masks, one-hot encoded. Inputs are B * W * H * N tensors, with
      • B = batch size,
      • W = width,
      • H = height,
      • N = number of classes
    • y_pred: predicted masks, either softmax outputs, or one-hot encoded. Inputs are B * W * H * N tensors, with
      • B = batch size,
      • W = width,
      • H = height,
      • N = number of classes
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • Dice loss [tensor]
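
A sketch of the smoothed (soft) Dice coefficient and its loss:

```python
from tensorflow.keras import backend as K

def dice_coef_sketch(y_true, y_pred, smooth=1.0):
    # Dice(A,B) = 2*|A & B| / (|A| + |B|), smoothed to avoid division by zero
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2.0 * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

def dice_coef_loss_sketch(y_true, y_pred):
    return 1.0 - dice_coef_sketch(y_true, y_pred)
```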

tfrecords_funcs.py

TF-dataset creation


get_batched_dataset_oysternet
get_batched_dataset_oysternet(filenames)

This function defines a workflow for the model to read data from tfrecord files by defining the degree of parallelism, batch size, pre-fetching, etc and also formats the imagery properly for model training (assumes oysternet by using read_seg_tfrecord_oysternet)

  • INPUTS:
    • filenames [list]
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: BATCH_SIZE, AUTO
  • OUTPUTS: tf.data.Dataset object

get_batched_dataset_obx
get_batched_dataset_obx(filenames, flag)

This function defines a workflow for the model to read data from tfrecord files by defining the degree of parallelism, batch size, pre-fetching, etc and also formats the imagery properly for model training

If input flag is 'binary', read_seg_tfrecord_obx_binary is used to read tfrecords and parse into two categories (deep vs everything else)

If input flag is 'multiclass', read_seg_tfrecord_obx_multiclass is used to parse tfrecords into 4 classes, recoded 0 through 3

  • INPUTS:
    • filenames [list]
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: BATCH_SIZE, AUTO
  • OUTPUTS: tf.data.Dataset object

write_seg_records_obx
write_seg_records_obx(dataset, tfrecord_dir)

This function writes a tf.data.Dataset object to TFRecord shards. The version for OBX data prepends "obx" to the filenames, but otherwise is identical to write_seg_records

  • INPUTS:
    • dataset [tf.data.Dataset]
    • tfrecord_dir [string] : path to directory where files will be written
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS: None (files written to disk)

write_seg_records_oysternet
write_seg_records_oysternet(dataset, tfrecord_dir, filestr)

This function writes a tf.data.Dataset object to TFRecord shards

  • INPUTS:
    • dataset [tf.data.Dataset]
    • tfrecord_dir [string] : path to directory where files will be written
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS: None (files written to disk)

TFRecord reading


read_seg_tfrecord_obx_binary
read_seg_tfrecord_obx_binary(example)

This function reads an example from a TFrecord file into a single image and label. In this binary image creator for OBX, input 4-class imagery is binarized based on 0=63=deep, 1=128=broken, 1=191=shallow, 1=255=dry

  • INPUTS:
    • TFRecord example object
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: TARGET_SIZE
  • OUTPUTS:
    • image [tensor array]
    • class_label [tensor array]

read_seg_tfrecord_obx_multiclass
read_seg_tfrecord_obx_multiclass(example)

This function reads an example from a TFrecord file into a single image and label. This is the "multiclass" version for OBX imagery, where the classes are mapped as follows: 0=63=deep, 1=128=broken, 2=191=shallow, 3=255=dry

  • INPUTS:
    • TFRecord example object
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: TARGET_SIZE
  • OUTPUTS:
    • image [tensor array]
    • class_label [tensor array]

get_seg_dataset_for_tfrecords_oysternet
get_seg_dataset_for_tfrecords_oysternet(imdir, lab_path, shared_size)

This function reads each image and label pair from the provided directories, resizes and crops them to TARGET_SIZE, and creates batches ready for writing to TFRecord shards. This works because the images and labels have the same name but different paths, hence tf.strings.regex_replace(img_path, "images", "labels")

  • INPUTS:
    • imdir [string]: directory of images
    • lab_path [string]: directory of label images
    • shared_size [int]: number of examples per shard
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: TARGET_SIZE
  • OUTPUTS: tf.data.Dataset object

get_seg_dataset_for_tfrecords_obx
get_seg_dataset_for_tfrecords_obx(imdir, lab_path, shared_size)

This function reads each image and label pair and decodes both jpegs into bytestring arrays. This is the version for OBX data, which differs in its use of read_seg_image_and_label_obx and resize_and_crop_seg_image_obx for image pre-processing

  • INPUTS:
    • imdir [string]: directory of images
    • lab_path [string]: directory of label images
    • shared_size [int]: number of examples per shard
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: TARGET_SIZE
  • OUTPUTS: tf.data.Dataset object

read_seg_image_and_label
read_seg_image_and_label(img_path)

This function reads an image and label and decodes both jpegs into bytestring arrays. This works because the images and labels have the same name but different paths, hence tf.strings.regex_replace(img_path, "images", "labels")

  • INPUTS:
    • img_path [tensor string]
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • image [bytestring]
    • label [bytestring]

read_seg_image_and_label_obx
read_seg_image_and_label_obx(img_path)

This function reads an image and label and decodes both jpegs into bytestring arrays. This works by parsing out the label image filename from its image pair. There are different rules for non-augmented versus augmented imagery

  • INPUTS:
    • img_path [tensor string]
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • image [bytestring]
    • label [bytestring]

resize_and_crop_seg_image
resize_and_crop_seg_image(image, label)

This function crops to square and resizes an image and label

  • INPUTS:
    • image [tensor array]
    • label [tensor array]
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: TARGET_SIZE
  • OUTPUTS:
    • image [tensor array]
    • label [tensor array]

resize_and_crop_seg_image_obx
resize_and_crop_seg_image_obx(image, label)

This function crops to square and resizes an image and label

  • INPUTS:
    • image [tensor array]
    • label [tensor array]
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: TARGET_SIZE
  • OUTPUTS:
    • image [tensor array]
    • label [tensor array]

recompress_seg_image
recompress_seg_image(image, label)

This function takes an image and label encoded as a byte string and recodes as an 8-bit jpeg

  • INPUTS:
    • image [tensor array]
    • label [tensor array]
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • image [tensor array]
    • label [tensor array]

read_seg_tfrecord_oysternet
read_seg_tfrecord_oysternet(example)

This function reads an example from a TFrecord file into a single image and label

  • INPUTS:
    • TFRecord example object
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: TARGET_SIZE
  • OUTPUTS:
    • image [tensor array]
    • class_label [tensor array]

seg_file2tensor
seg_file2tensor(f)

This function reads a jpeg image from file into a cropped and resized tensor, for use in prediction with a trained segmentation model

  • INPUTS:
    • f [string] file name of jpeg
  • OPTIONAL INPUTS: None
  • OUTPUTS:
    • image [tensor array]: unstandardized image
  • GLOBAL INPUTS: TARGET_SIZE

plot_funcs.py


make_sample_ensemble_seg_plot
make_sample_ensemble_seg_plot(model2, model3, sample_filenames, test_samples_fig, flag='binary')

This function uses two trained models to estimate the label image from each input image. It then uses a KL score to determine which one to return, and returns both images and labels as a list, as well as a list of which model's output is returned

  • INPUTS:
    • model2, model3: trained and compiled keras models
    • sample_filenames: [list] of strings
    • test_samples_fig [string]: filename to print figure to
    • flag [string]: either 'binary' or 'multiclass'
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • imgs: [list] of images
    • lbls: [list] of label images
    • model_num: [list] of integers indicating which model's output was returned based on CRF KL divergence

make_sample_seg_plot
make_sample_seg_plot(model, sample_filenames, test_samples_fig, flag='binary')

This function uses a trained model to estimate the label image from each input image and returns both images and labels as a list

  • INPUTS:
    • model: trained and compiled keras model
    • sample_filenames: [list] of strings
    • test_samples_fig [string]: filename to print figure to
    • flag [string]: either 'binary' or 'multiclass'
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • imgs: [list] of images
    • lbls: [list] of label images

plot_seg_history
plot_seg_history(history, train_hist_fig)

This function plots the training history of a model

  • INPUTS:
    • history [dict]: the output dictionary of the model.fit() process, i.e. history = model.fit(...)
    • train_hist_fig [string]: the filename where the plot will be printed
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS: None (figure printed to file)

plot_seg_history_iou
plot_seg_history_iou(history, train_hist_fig)

This function plots the training history of a model

  • INPUTS:
    • history [dict]: the output dictionary of the model.fit() process, i.e. history = model.fit(...)
    • train_hist_fig [string]: the filename where the plot will be printed
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS: None (figure printed to file)

crf_refine
crf_refine(label, img)

This function refines a label image based on an input label image and the associated image. It uses a conditional random field algorithm with spatial and image features

  • INPUTS:
    • label [ndarray]: label image 2D matrix of integers
    • image [ndarray]: image 3D matrix of integers
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS: label [ndarray]: label image 2D matrix of integers

4_UnsupImageRecog

General workflow using your own data

  1. Create a TFRecord dataset from your data, organised as follows:
  • Copy training images into a folder called train
  • Copy validation images into a folder called validation
  • Ensure the class name is written to each file name. Ideally this is a prefix such that it is trivial to extract the class name from the file name
  • Modify one of the provided workflows (such as tamucc_make_tfrecords.py) for your dataset, to create your train and validation tfrecord shards
  2. Set up your model
  • Decide on whether you want to train a small or large embedding model (get_embedding_model or get_large_embedding_model)
  3. Set up a data pipeline
  • Modify and follow the provided examples to create a get_training_dataset() and get_validation_dataset(). This will likely require you to copy and modify get_batched_dataset for your own needs, writing your own read_tfrecord function for your dataset, depending on the format of your labels in filenames and the model selected
  • Remember that for this method you have to read all the data into memory at once, which isn't ideal. You may therefore need to modify get_data_stuff to read your data more efficiently
  4. Set up a model training pipeline
  • .compile() your model with an appropriate loss function and metrics
  • Define a LearningRateScheduler function to vary learning rates over training as a function of training epoch
  • Define an EarlyStopping criterion and create a ModelCheckpoint to save trained model weights
  5. Train the autoencoder embedding model
  • Use history = model.fit() to create a record of the training history. Pass the training and validation datasets, and a list of callbacks containing your model checkpoint, learning rate scheduler, and early stopping monitor
  6. Train the k-nearest neighbour classifier (see the sketch after this list)
  • Decide or determine the optimal number of neighbours (k)
  • Use fit_knn_to_embeddings to make a model of your training embeddings
  7. Evaluate your model
  • Plot and study the history time-series of losses and metrics. If unsatisfactory, begin the iterative process of model optimization
  • Use loss, accuracy = model.evaluate(get_validation_dataset(), batch_size=BATCH_SIZE, steps=validation_steps) with the validation dataset, specifying the number of validation steps
  • Make plots of model outputs, organized in such a way that you can see at a glance where the model is failing. Use make_sample_plot and p_confmat, as a starting point, to visualize sample imagery with their model predictions, and a confusion matrix of predicted/true class-correspondences
  • On the test set, experiment with tf.nn.l2_normalize (i.e. try omitting it on test and/or train embeddings and see if that improves results)
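
A minimal sketch of step 6, assuming a trained embedding model and the X_train/ytrain lists produced by get_data_stuff (documented below):

```python
import numpy as np
import tensorflow as tf
from sklearn.neighbors import KNeighborsClassifier

embeddings = model.predict(np.array(X_train))
embeddings = tf.nn.l2_normalize(embeddings, axis=-1).numpy()  # optionally omit; see step 7

knn = KNeighborsClassifier(n_neighbors=3)  # k=3 is a hypothetical choice
knn.fit(embeddings, ytrain)
```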

model_funcs.py

Model creation


EmbeddingModel
EmbeddingModel()

Code modified from https://keras.io/examples/vision/metric_learning/. This class allows an embedding model (a get_embedding_model or get_large_embedding_model instance) to be trained using the conventional model.fit(), whereby it can be passed another class that provides batches of data examples in the form of anchors, positives, and negatives

  • INPUTS: None
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS: model training metrics
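Following the keras metric-learning example this class is modified from, the essential mechanism is an overridden train_step in which each anchor is trained to be most similar to its own positive, with the other examples in the batch acting as implicit negatives. A minimal sketch (the temperature value is illustrative):

```python
import tensorflow as tf
from tensorflow import keras

class EmbeddingModel(keras.Model):
    def train_step(self, data):
        anchors, positives = data
        with tf.GradientTape() as tape:
            anchor_embeddings = self(anchors, training=True)
            positive_embeddings = self(positives, training=True)
            # similarity between every anchor and every positive in the batch
            similarities = tf.einsum("ae,pe->ap", anchor_embeddings, positive_embeddings)
            similarities /= 0.2  # temperature: sharpens the softmax
            # each anchor's true match is the positive at the same batch index
            sparse_labels = tf.range(tf.shape(anchors)[0])
            loss = self.compiled_loss(sparse_labels, similarities)
        gradients = tape.gradient(loss, self.trainable_variables)
        self.optimizer.apply_gradients(zip(gradients, self.trainable_variables))
        self.compiled_metrics.update_state(sparse_labels, similarities)
        return {m.name: m.result() for m in self.metrics}
```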

get_large_embedding_model
get_large_embedding_model(TARGET_SIZE, num_classes, num_embed_dim)

Code modified from https://keras.io/examples/vision/metric_learning/. This function makes an instance of a larger embedding model, which is a keras sequential model consisting of 5 convolutional blocks, 2D average pooling, and an embedding layer

  • INPUTS:
    • TARGET_SIZE [int]: dimensions of the input imagery
    • num_classes [int]: number of classes
    • num_embed_dim [int]: number of embedding dimensions
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • model [keras model]

get_embedding_model
get_embedding_model(TARGET_SIZE, num_classes, num_embed_dim)

Code modified from https://keras.io/examples/vision/metric_learning/. This function makes an instance of an embedding model, which is a keras sequential model consisting of 3 convolutional blocks, 2D average pooling, and an embedding layer (see the sketch below)

  • INPUTS:
    • TARGET_SIZE [int]: dimensions of the input imagery
    • num_classes [int]: number of classes
    • num_embed_dim [int]: number of embedding dimensions
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • model [keras model]
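From the description above (3 convolutional blocks, 2D average pooling, an embedding layer), the model plausibly has a shape like the following sketch. The filter counts, kernel sizes, and strides are illustrative guesses rather than the course's exact values, num_classes is accepted only for signature compatibility, and the l2-normalization reflects the note in the workflow above:

```python
import tensorflow as tf
from tensorflow import keras

def get_embedding_model_sketch(TARGET_SIZE, num_classes, num_embed_dim):
    inputs = keras.layers.Input(shape=(TARGET_SIZE, TARGET_SIZE, 3))
    # 3 convolutional blocks, downsampling by striding
    x = keras.layers.Conv2D(32, 3, strides=2, activation='relu')(inputs)
    x = keras.layers.Conv2D(64, 3, strides=2, activation='relu')(x)
    x = keras.layers.Conv2D(128, 3, strides=2, activation='relu')(x)
    # 2D average pooling down to one vector per image
    x = keras.layers.GlobalAveragePooling2D()(x)
    # embedding layer, l2-normalized so nearest-neighbour distances are meaningful
    embeddings = keras.layers.Dense(num_embed_dim, activation=None)(x)
    embeddings = tf.nn.l2_normalize(embeddings, axis=-1)
    return EmbeddingModel(inputs, embeddings)
```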

Model training


fit_knn_to_embeddings
fit_knn_to_embeddings(model, X_train, ytrain, n_neighbors)

This function fits a k-nearest neighbours classifier to the embeddings generated by the trained embedding model from the training data, returning a sklearn knn model (see the sketch after the inputs/outputs below)

  • INPUTS:
    • model [keras model]
    • X_train [list]
    • ytrain [list]
    • n_neighbors [int]: number of neighbours, k
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • knn [sklearn knn model]
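Given the signature and outputs, a minimal sketch of this function might look like the following, assuming model.predict returns one embedding vector per training image:

```python
from sklearn.neighbors import KNeighborsClassifier

def fit_knn_to_embeddings_sketch(model, X_train, ytrain, n_neighbors):
    # embed every training image with the trained embedding model
    embeddings = model.predict(X_train)
    # fit a k-nearest neighbour classifier in embedding space
    knn = KNeighborsClassifier(n_neighbors=n_neighbors)
    knn.fit(embeddings, ytrain)
    return knn
```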

weighted_binary_crossentropy
weighted_binary_crossentropy(zero_weight, one_weight)

This function computes weighted binary crossentropy loss

  • INPUTS:
    • zero_weight [float]: weight for the zero class
    • one_weight [float]: weight for the one class
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS: wbce [function]: the weighted binary crossentropy loss function, with the class weights bound to it
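A common way to implement such a closure, and a plausible sketch of this function, is:

```python
from tensorflow.keras import backend as K

def weighted_binary_crossentropy_sketch(zero_weight, one_weight):
    def wbce(y_true, y_pred):
        # element-wise binary crossentropy
        b_ce = K.binary_crossentropy(y_true, y_pred)
        # weight the positive (one) and negative (zero) classes differently
        weights = y_true * one_weight + (1.0 - y_true) * zero_weight
        return K.mean(weights * b_ce)
    return wbce
```

The returned wbce closure can then be passed directly to model.compile(loss=...).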

lrfn
lrfn(epoch)

This function creates a custom piecewise linear-exponential learning rate function for a custom learning rate scheduler. The rate increases linearly to a maximum, is sustained there, then decays exponentially

  • INPUTS: current epoch number
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: start_lr, min_lr, max_lr, rampup_epochs, sustain_epochs, exp_decay
  • OUTPUTS: lr [float]: the learning rate for the given epoch
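The piecewise schedule described above, driven by the listed global variables (assumed to be set elsewhere), plausibly looks like this sketch:

```python
def lrfn_sketch(epoch):
    # relies on the globals listed above: start_lr, min_lr, max_lr,
    # rampup_epochs, sustain_epochs, exp_decay
    if epoch < rampup_epochs:
        # ramp linearly from start_lr up to max_lr
        lr = (max_lr - start_lr) / rampup_epochs * epoch + start_lr
    elif epoch < rampup_epochs + sustain_epochs:
        # hold at the maximum learning rate
        lr = max_lr
    else:
        # decay exponentially back towards min_lr
        lr = (max_lr - min_lr) * exp_decay ** (epoch - rampup_epochs - sustain_epochs) + min_lr
    return lr
```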

tfrecords_funcs.py

TF-dataset creation


get_batched_dataset
get_batched_dataset(filenames)

This function defines a workflow for the model to read data from tfrecord files by defining the degree of parallelism, batch size, pre-fetching, etc., and also formats the imagery properly for model training (assumes mobilenet, by using read_tfrecord_mv2; see the sketch after the inputs/outputs below)

  • INPUTS:
    • filenames [list]
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: BATCH_SIZE, AUTO
  • OUTPUTS: tf.data.Dataset object
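A typical tf.data pipeline matching this description, using the read_tfrecord_mv2 parser and the BATCH_SIZE and AUTO globals noted above (the shuffle buffer size is illustrative), is sketched below:

```python
import tensorflow as tf

AUTO = tf.data.experimental.AUTOTUNE  # let tf.data tune parallelism

def get_batched_dataset_sketch(filenames):
    # read shards in parallel; ordering is irrelevant for training
    options = tf.data.Options()
    options.experimental_deterministic = False
    dataset = tf.data.TFRecordDataset(filenames, num_parallel_reads=AUTO)
    dataset = dataset.with_options(options)
    # parse and format each example for the (mobilenet) model
    dataset = dataset.map(read_tfrecord_mv2, num_parallel_calls=AUTO)
    dataset = dataset.cache()
    dataset = dataset.repeat()
    dataset = dataset.shuffle(2048)  # buffer size is illustrative
    dataset = dataset.batch(BATCH_SIZE, drop_remainder=True)
    dataset = dataset.prefetch(AUTO)  # overlap preprocessing with training
    return dataset
```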

get_data_stuff
get_data_stuff(ds, num_batches)

This function extracts lists of images and corresponding labels for training or testing

  • INPUTS:
    • ds [PrefetchDataset]: either get_training_dataset() or get_validation_dataset()
    • num_batches [int]
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • X [list]
    • y [list]
    • class_idx_to_train_idxs [collections.defaultdict]

recompress_image
recompress_image(image, label)

This function takes an image encoded as a byte string and re-encodes it as an 8-bit jpeg. The label passes through unmodified

  • INPUTS:
    • image [tensor array]
    • label [int]
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • image [tensor array]
    • label [int]
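A minimal sketch, assuming the image arrives as a decoded pixel tensor and using tf.image.encode_jpeg (the encoding options shown are illustrative):

```python
import tensorflow as tf

def recompress_image_sketch(image, label):
    # cast to 8-bit and re-encode as a jpeg byte string; label is untouched
    image = tf.cast(image, tf.uint8)
    image = tf.image.encode_jpeg(image, optimize_size=True,
                                 chroma_downsampling=False)
    return image, label
```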

resize_and_crop_image
resize_and_crop_image(image, label)

This function crops to square and resizes an image. The label passes through unmodified

  • INPUTS:
    • image [tensor array]
    • label [int]
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: TARGET_SIZE
  • OUTPUTS:
    • image [tensor array]
    • label [int]

to_tfrecord
to_tfrecord(img_bytes, label, CLASSES)

This function creates a TFRecord example from an image byte string and a label feature

  • INPUTS:
    • img_bytes
    • label
    • CLASSES
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS: tf.train.Feature example
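The standard TFRecord pattern this describes looks roughly like the sketch below; the feature names "image" and "class" are assumptions, and must match whatever the corresponding read_tfrecord parser expects:

```python
import numpy as np
import tensorflow as tf

def to_tfrecord_sketch(img_bytes, label, CLASSES):
    # integer class index = position of this label in the CLASSES list
    class_num = np.argmax(np.array(CLASSES) == label)
    feature = {
        'image': tf.train.Feature(bytes_list=tf.train.BytesList(value=[img_bytes])),
        'class': tf.train.Feature(int64_list=tf.train.Int64List(value=[class_num])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))
```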

get_dataset_for_tfrecords
get_dataset_for_tfrecords(recoded_dir, shared_size)

This function reads a list of TFRecord shard files, decodes the images and labels, resizes and crops the images to TARGET_SIZE, and creates batches

  • INPUTS:
    • recoded_dir
    • shared_size
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: TARGET_SIZE
  • OUTPUTS: tf.data.Dataset object

write_records
write_records(tamucc_dataset, tfrecord_dir, CLASSES)

This function writes a tf.data.Dataset object to TFRecord shards

  • INPUTS:
    • tamucc_dataset [tf.data.Dataset]
    • tfrecord_dir [string] : path to directory where files will be written
    • CLASSES [list] of class string names
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS: None (files written to disk)

TFRecord reading


read_classes_from_json
read_classes_from_json(json_file)

This function reads the contents of a json file enumerating classes

  • INPUTS:
    • json_file [string]: full path to the json file
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS: CLASSES [list]: list of classes as byte strings

file2tensor
file2tensor(f)

This function reads a jpeg image from file into a cropped and resized tensor, for use in prediction with a trained mobilenet or vgg model (the imagery is standardized depending on the target model framework)

  • INPUTS:
    • f [string] file name of jpeg
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: TARGET_SIZE
  • OUTPUTS:
    • image [tensor array]: unstandardized image
    • im [tensor array]: standardized image

read_tfrecord
read_tfrecord(example)

This function reads an example from a TFRecord file into a single image and label (see the sketch after the inputs/outputs below)

  • INPUTS:
    • TFRecord example object
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: TARGET_SIZE
  • OUTPUTS:
    • image [tensor array]
    • class_label [tensor int]
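A sketch of a parser consistent with this description; the feature names must mirror those used when the records were written, and the scaling of pixels to [0, 1] is an assumption:

```python
import tensorflow as tf

def read_tfrecord_sketch(example):
    # feature spec must mirror the one used to write the records
    features = {
        'image': tf.io.FixedLenFeature([], tf.string),  # jpeg byte string
        'class': tf.io.FixedLenFeature([], tf.int64),   # integer class label
    }
    example = tf.io.parse_single_example(example, features)
    image = tf.image.decode_jpeg(example['image'], channels=3)
    image = tf.cast(image, tf.float32) / 255.0  # scale pixels to [0, 1]
    image = tf.reshape(image, [TARGET_SIZE, TARGET_SIZE, 3])
    class_label = tf.cast(example['class'], tf.int32)
    return image, class_label
```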

read_image_and_label
read_image_and_label(img_path)

This function reads a jpeg image from a provided filepath and extracts the label from the filename (assuming the class name is before _IMG in the filename)

  • INPUTS:
    • img_path [string]: filepath to a jpeg image
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • image [tensor array]
    • class_label [tensor int]
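A sketch of this filename-parsing logic, assuming forward-slash paths and a CLASSES list of class-name strings for the final string-to-integer conversion (the CLASSES lookup is an assumption, since this function's documented global inputs are None):

```python
import tensorflow as tf

def read_image_and_label_sketch(img_path):
    # read and decode the jpeg
    bits = tf.io.read_file(img_path)
    image = tf.image.decode_jpeg(bits)
    # the class name is assumed to precede '_IMG' in the filename
    fname = tf.strings.split(img_path, sep='/')[-1]
    class_name = tf.strings.split(fname, sep='_IMG')[0]
    # convert the class-name string to an integer index via CLASSES
    class_label = tf.argmax(tf.cast(tf.equal(class_name, CLASSES), tf.int64))
    return image, class_label
```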

plot_funcs.py


conf_mat_filesamples
conf_mat_filesamples(model, knn, sample_filenames, num_classes, num_dim_use, CLASSES)

This function computes a confusion matrix (matrix of correspondences between true and estimated classes) for a list of sample image files, using the embedding model and knn classifier to estimate classes, then normalizes by column totals and makes a heatmap plot of the matrix

  • INPUTS:
    • model [keras model]
    • knn [sklearn knn model]
    • sample_filenames [list] of strings
    • num_classes [int]
    • num_dim_use [int]
    • CLASSES [list] of strings: class names
  • OPTIONAL INPUTS: None
  • GLOBAL INPUTS: None
  • OUTPUTS:
    • cm [ndarray]: confusion matrix

p_confmat
p_confmat(labs, preds, cm_filename, CLASSES, thres = 0.1)

This function computes a confusion matrix (matrix of correspondences between true and estimated classes) using the sklearn function of the same name. Then normalizes by column totals, and makes a heatmap plot of the matrix saving out to the provided filename, cm_filename

  • INPUTS:
    • labs [ndarray]: 1d vector of labels
    • preds [ndarray]: 1d vector of model predicted labels
    • cm_filename [string]: filename to write the figure to
    • CLASSES [list] of strings: class names
  • OPTIONAL INPUTS:
    • thres [float]: threshold controlling what values are displayed
  • GLOBAL INPUTS: None
  • OUTPUTS: None (figure printed to file)
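A plausible sketch of this plotting routine, using sklearn's confusion_matrix plus a seaborn heatmap (the figure size, colormap, and dpi are illustrative):

```python
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix

def p_confmat_sketch(labs, preds, cm_filename, CLASSES, thres=0.1):
    cm = confusion_matrix(labs, preds).astype('float')
    # normalize by column totals so each column sums to 1
    cm = cm / (cm.sum(axis=0, keepdims=True) + 1e-12)
    cm[cm < thres] = 0.0  # suppress small values for readability
    plt.figure(figsize=(12, 12))
    sns.heatmap(cm, annot=True, fmt='.2f', cmap='Blues',
                xticklabels=CLASSES, yticklabels=CLASSES)
    plt.xlabel('Estimated class')
    plt.ylabel('True class')
    plt.savefig(cm_filename, dpi=200, bbox_inches='tight')
    plt.close('all')
```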