2017 05 29 Second meeting: got first images, trained first networks, but still needs a lot of improvements
Arthur Silber edited this page May 29, 2017
Arthur:
- (click here for example images)
- Flickr API downloader: got 500 pictures each for Reichstag, Brandenburger Tor, Dom and Rathaus
- Segmentation with ademxapp: works quite well, recognizes "building" pixels
- Postprocessing of these labels with erode/dilate, bounding box (BB) generation based on the result
- Trained SSD with 4 classes; results are not great (probably due to poor training image quality / BB quality)
- Cityscapes dataset: offers segmentation, but has the same problems as Flickr: (1) multiple buildings merged into one blob, (2) problematic point of view / cropped buildings (i.e. same or lower quality than the existing Flickr approach)
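The erode/dilate postprocessing and BB generation mentioned above could look roughly like the following. This is a minimal sketch, not the code used in the project: it assumes the segmentation output is available as a 2D boolean NumPy mask of "building" pixels, uses a fixed 3x3 neighbourhood, and all function names are hypothetical.

```python
import numpy as np

def _shifted_views(mask):
    """Return the 9 views of `mask` shifted by -1..1 pixels in y and x."""
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    h, w = mask.shape
    return [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]

def erode(mask):
    """A pixel survives only if its whole 3x3 neighbourhood is set."""
    return np.logical_and.reduce(_shifted_views(mask))

def dilate(mask):
    """A pixel is set if any pixel in its 3x3 neighbourhood is set."""
    return np.logical_or.reduce(_shifted_views(mask))

def bounding_box(mask):
    """Return (x_min, y_min, x_max, y_max), inclusive, or None for an empty mask."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return None
    return int(cols[0]), int(rows[0]), int(cols[-1]), int(rows[-1])

def building_box(mask, iterations=1):
    """Erode then dilate (a morphological opening) to drop specks, then fit one box."""
    cleaned = mask.astype(bool)
    for _ in range(iterations):
        cleaned = erode(cleaned)
    for _ in range(iterations):
        cleaned = dilate(cleaned)
    return bounding_box(cleaned)

# Example: a 5x5 "building" plus one isolated noise pixel.
mask = np.zeros((10, 10), dtype=bool)
mask[2:7, 3:8] = True
mask[0, 0] = True
box = building_box(mask)
```

Because the sketch fits a single box around everything that survives the opening, it has the same "multiple buildings as one blob" weakness noted above.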
Discussion/Tips:
- improve BBs:
- look at height distances (find minima), throw out strange components (unconnected/small)
- crop images: fill up with black pixels
- possibly sort the training data by viewing angle of the subject
- 1000 examples / class for training == human-level accuracy --> use more
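One possible reading of the "crop images: fill up with black pixels" tip is to crop each image to its BB and pad the crop to a square with black pixels, so that buildings are not distorted when resized for the network. The sketch below assumes that reading, an inclusive `(x_min, y_min, x_max, y_max)` box format, and a hypothetical function name.

```python
import numpy as np

def crop_to_box_padded(image, box):
    """Crop `image` to `box` and centre the crop on a black square canvas."""
    x_min, y_min, x_max, y_max = box
    # NumPy indexes rows (y) first, then columns (x).
    crop = image[y_min:y_max + 1, x_min:x_max + 1]
    h, w = crop.shape[:2]
    side = max(h, w)
    # Zeros are black; keep the channel axis if the image has one.
    canvas = np.zeros((side, side) + crop.shape[2:], dtype=image.dtype)
    top = (side - h) // 2
    left = (side - w) // 2
    canvas[top:top + h, left:left + w] = crop
    return canvas

# Example: a white 6x8 RGB image, cropped to a 2x4 region.
image = np.full((6, 8, 3), 255, dtype=np.uint8)
padded = crop_to_box_padded(image, (1, 2, 4, 3))
```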
ToDo:
- Gitlab organization
Adrian:
- Downloaded the LabelMe dataset
- Extracted BBs out of polygons (quality of existing polygons is not great)
- Training SSD failed due to NaN errors
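Extracting a BB from a polygon amounts to taking the minimum and maximum of its coordinates. A minimal sketch, assuming each polygon has already been parsed into a list of `(x, y)` points (the parsing step and the function name are not from the notes):

```python
def polygon_to_bbox(points):
    """Return the axis-aligned box (x_min, y_min, x_max, y_max) around a polygon."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)

# Example: a triangle.
bbox = polygon_to_bbox([(10, 20), (30, 5), (25, 40)])
```

A sloppy polygon directly yields a sloppy box, which matches the quality concern noted above.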
Discussion/Tips:
- NaN values during training? -> reduce learning rate
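The tip can be illustrated with a toy example: if the loss becomes NaN or infinite, restart with a smaller learning rate. This is only a sketch. The toy step below (gradient descent on f(w) = w**2) stands in for a real SSD training step, and the function names, the factor of 0.1 and the step count are assumptions.

```python
import math

def train_with_lr_backoff(make_step, lr, steps=200, factor=0.1, max_restarts=5):
    """Restart training with a smaller learning rate whenever the loss is NaN/inf."""
    for _ in range(max_restarts + 1):
        step = make_step(lr)          # fresh model state for every attempt
        loss = float("nan")
        for _ in range(steps):
            loss = step()
            if not math.isfinite(loss):
                break                 # diverged: stop this attempt early
        if math.isfinite(loss):
            return lr, loss
        lr *= factor                  # reduce the learning rate and retry
    raise RuntimeError("loss stayed non-finite even after reducing the learning rate")

def make_toy_step(lr):
    """Toy stand-in for a training step: gradient descent on f(w) = w**2."""
    w = [1.0]
    def step():
        w[0] -= lr * 2.0 * w[0]       # the gradient of w**2 is 2*w
        return w[0] * w[0]            # loss after the update
    return step

# With lr=5.0 the toy loss overflows to inf, so the helper retries with a smaller value.
final_lr, final_loss = train_with_lr_backoff(make_toy_step, lr=5.0)
```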
Fabian:
- computer vision toolbox for BB improvements?
- PhD thesis: not that usable for us (different focus)
Torben:
- Fetched Flickr dataset
- tried day/night segmentation based on pixel values (did that work out?)
- General: store training data on the server
- Adrian: set up a Python virtual environment on the server --> virtualenvwrapper
- Adrian + Fabian: train SSD on buildings / non-buildings (goal: BB generation for arbitrary buildings)
- Martin: 7k images with non-buildings to help Adrian's training
- Arthur + Torben: get more Flickr data (Arthur: Brandenburger Tor, Reichstag, Rotes Rathaus, Dom; Torben: Gedächtniskirche, Fernsehturm)
- A+T: postprocessing of Flickr data (goal: high quality training set for SSD)
- A+T: train SSD with Flickr data
NumPy: IMPORTANT: rows first, then columns: the height of an image comes first, then the width (for use in the network)
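A short illustration of this axis order, using made-up image dimensions:

```python
import numpy as np

height, width = 480, 640
# Shape is (rows, columns, channels), i.e. (height, width, 3).
image = np.zeros((height, width, 3), dtype=np.uint8)

# Indexing is image[y, x]: row 10, column 200.
image[10, 200] = (255, 0, 0)

rows, cols = image.shape[:2]
```

BBs and many image libraries use (x, y) or (width, height) order instead (Pillow's `Image.size`, for example, is `(width, height)`), so the axes need to be swapped when converting.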
15 minutes
- Overview: what do we want to do
- Present related work
- Possible approaches: how do we want to do it
- Milestone plan