Working out where to go next with our “super-awesome” 42GB training set, I realized that the next step needs a much lower resolution. We are currently at 1920 x 1080, while many traditional neural networks such as ResNet happily cope with a much more modest 224 x 224.
So, on to some tiny Python scripting to convert our set to 224 x 224, in grayscale! The following script does that magic in about 90 minutes.
#!/usr/bin/env python3
from argparse import ArgumentParser, BooleanOptionalAction
from glob import glob
import os

from PIL import Image, ImageOps


def transform(im, args):
    # Centered crop box of (width * scale) x (height * scale) pixels.
    left = im.width / 2 - args.width / 2 * args.scale
    top = im.height / 2 - args.height / 2 * args.scale
    right = im.width / 2 + args.width / 2 * args.scale
    bottom = im.height / 2 + args.height / 2 * args.scale
    return left, top, right, bottom


def crop(args):
    # Collect every PNG under the input folder, recursively.
    result = [y for x in os.walk(args.input) for y in glob(os.path.join(x[0], '*.png'))]
    counter = 0
    for filename in result:
        with Image.open(filename) as im:
            im2 = im.crop(transform(im, args))
            im2 = im2.resize((args.width, args.height))
            if args.grayscale:
                im2 = ImageOps.grayscale(im2)
            # Mirror the input folder structure under the output folder.
            filename_out = os.path.join(args.output, os.path.relpath(filename, args.input))
            os.makedirs(os.path.dirname(filename_out), exist_ok=True)
            im2.save(filename_out)
        counter += 1
        pct = round(counter / len(result) * 100, 4)
        print("Finished processing for", filename_out, "\t[", pct, "%]")


def main():
    parser = ArgumentParser(description='Recursive crop & transformation for image files')
    parser.add_argument('--input', default='./train', help='input folder')
    parser.add_argument('--output', default='./train_224_224_monochrome', help='output folder')
    # type=int so values passed on the command line are not left as strings.
    parser.add_argument('--height', type=int, default=224, help='image height')
    parser.add_argument('--width', type=int, default=224, help='image width')
    # BooleanOptionalAction (Python 3.9+) also adds --no-grayscale to switch it off.
    parser.add_argument('--grayscale', action=BooleanOptionalAction, default=True, help='convert output to grayscale')
    parser.add_argument('--scale', type=float, default=2, help='scale down coefficient')
    args = parser.parse_args()
    print('input folder: ', args.input)
    print('output folder: ', args.output)
    # The parsed args object is always truthy, so check the input folder instead.
    if os.path.isdir(args.input):
        crop(args)
    else:
        parser.print_help()
    print("Done")


if __name__ == '__main__':
    main()
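To make the crop arithmetic in transform() a bit more tangible, here is a small standalone sketch of the same formula applied to one of our 1920 x 1080 frames with the default settings. The helper name center_crop_box is made up for this illustration and is not part of the script; it needs no Pillow at all.

```python
def center_crop_box(img_w, img_h, out_w=224, out_h=224, scale=2):
    """Hypothetical helper mirroring transform(): a centered box of
    (out_w * scale) x (out_h * scale) pixels as (left, top, right, bottom)."""
    half_w = out_w * scale / 2
    half_h = out_h * scale / 2
    return (img_w / 2 - half_w, img_h / 2 - half_h,
            img_w / 2 + half_w, img_h / 2 + half_h)


box = center_crop_box(1920, 1080)
print(box)  # (736.0, 316.0, 1184.0, 764.0)

# Share of the original frame that survives the crop.
kept = (box[2] - box[0]) * (box[3] - box[1]) / (1920 * 1080)
print(f"{kept:.1%} of the frame is kept")  # 9.7% of the frame is kept
```

So with scale 2 only the central 448 x 448 square is kept, and that square is then shrunk by half to 224 x 224. A larger --scale keeps more of the sky at the cost of finer detail.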
The whole operation shrank our initial 48.5GB monster of a training set to a much more convenient 1.4GB.
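As a rough sanity check on those numbers, the sketch below compares the raw pixel data per image before and after with the drop in size on disk. It assumes the source frames are 8-bit RGB (which I did not verify) and ignores PNG compression, so treat it as back-of-the-envelope arithmetic only.

```python
# Uncompressed bytes per image (assumption: 8-bit RGB source, 8-bit grayscale output).
src_bytes = 1920 * 1080 * 3
dst_bytes = 224 * 224 * 1

raw_ratio = src_bytes / dst_bytes
observed_ratio = 48.5 / 1.4  # folder sizes in GB, before and after

print(f"raw pixel data shrinks about {raw_ratio:.0f}x")   # about 124x
print(f"size on disk shrinks about {observed_ratio:.1f}x")  # about 34.6x
```

The gap between the two ratios is presumably down to PNG compression, which apparently squeezes the large original frames harder per pixel than the small converted ones.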


The result seems a bit radical, but well, it is what it is.


Meanwhile Sebi kept reading and experimenting with machine learning and got some fantastic results there, but that’s for another post. 🙂