
2017 05 29 Second meeting: got first images, trained first networks, but a lot still needs improvement


What has been done

Arthur:

  • Flickr API downloader: got 500 pics each for Reichstag, Brandenburger Tor, Dom, and Rathaus
  • Segmentation with ademxapp: works quite well, recognizes "building" pixels
  • Postprocessing of these labels with erode/dilate, BB generation based on the result (see the sketch after this list)
  • Trained SSD with 4 classes, results not overly great (probably due to poor training image quality / BB quality)
  • Cityscapes dataset: offers segmentation, but has the same problems as Flickr: (1) multiple buildings merged into one blob, (2) problematic points of view / cropped buildings (i.e. same or lower quality than the existing Flickr approach)
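A minimal sketch of the erode/dilate postprocessing and BB generation mentioned above, assuming the ademxapp output is available as a binary "building" mask image; the file name, kernel size, and iteration counts are placeholders to be tuned:

```python
# Sketch: clean up a binary building mask and derive one BB per blob.
import cv2
import numpy as np

mask = cv2.imread("building_mask.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
_, mask = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)

# Erode then dilate (morphological opening) to remove small speckles and
# smooth the building blobs before extracting boxes.
kernel = np.ones((5, 5), np.uint8)
clean = cv2.erode(mask, kernel, iterations=2)
clean = cv2.dilate(clean, kernel, iterations=2)

# One bounding box per connected component.
num, labels, stats, _ = cv2.connectedComponentsWithStats(clean, connectivity=8)
boxes = []
for i in range(1, num):  # label 0 is the background
    x, y, w, h, area = stats[i]
    boxes.append((x, y, x + w, y + h))  # (xmin, ymin, xmax, ymax)
print(boxes)
```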

Discussion/Tips:

  • improve BBs (see the sketch after this list):
    • look at height distances (find minima), throw out strange components (unconnected/small)
    • when cropping images, pad with black pixels
    • if necessary, sort the training data by subject/viewing angle
    • roughly 1000 examples per class for training == human-level accuracy --> use more
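A sketch of two of the tips above: dropping small/unconnected components from the building mask before generating boxes, and padding crops with black pixels instead of distorting them. The area threshold is an assumption that would need tuning on our data:

```python
import cv2
import numpy as np

MIN_AREA = 2500  # assumed minimum component size in pixels

def filter_small_components(mask):
    """Remove connected components below MIN_AREA from a binary mask."""
    num, labels, stats, _ = cv2.connectedComponentsWithStats(mask, connectivity=8)
    keep = np.zeros_like(mask)
    for i in range(1, num):  # label 0 is the background
        if stats[i, cv2.CC_STAT_AREA] >= MIN_AREA:
            keep[labels == i] = 255
    return keep

def pad_to_square(crop):
    """Pad a cropped image with black pixels so it becomes square."""
    h, w = crop.shape[:2]
    size = max(h, w)
    top = (size - h) // 2
    left = (size - w) // 2
    return cv2.copyMakeBorder(crop, top, size - h - top, left, size - w - left,
                              cv2.BORDER_CONSTANT, value=0)
```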

ToDo:

  • Gitlab organization

Adrian:

  • Downloaded the LabelMe dataset
  • Extracted BBs from the polygons (quality of the existing polygons is not great; see the sketch after this list)
  • Training SSD failed due to NaN errors
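A minimal sketch of the polygon-to-BB extraction, assuming each LabelMe polygon is available as a list of (x, y) vertex coordinates:

```python
def polygon_to_bbox(points):
    """Axis-aligned bounding box of a polygon given as (x, y) vertices."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)  # (xmin, ymin, xmax, ymax)

print(polygon_to_bbox([(10, 40), (60, 20), (80, 90), (15, 70)]))  # (10, 20, 80, 90)
```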

Discussion/Tips:

  • NaN values during training? -> reduce learning rate
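A toy illustration of the mechanism behind this tip (not the SSD training code itself): plain gradient descent on f(x) = x**2 diverges towards inf/NaN when the step size is too large, and converges with a smaller one:

```python
import math

def run(lr, steps=200):
    x = 1.0  # start away from the minimum of f(x) = x**2 at x = 0
    for _ in range(steps):
        x -= lr * 2 * x  # gradient of x**2 is 2x
        if abs(x) > 1e30 or math.isnan(x):
            return "diverged"
    return "converged to %.3g" % x

print("lr = 1.5:", run(1.5))  # |x| doubles every step -> diverges
print("lr = 0.1:", run(0.1))  # |x| shrinks by a factor of 0.8 every step -> converges
```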

Fabian:

  • computer vision toolbox for BB improvements?
  • PhD thesis: not that usable for us (different focus)

Torben:

  • Fetched the Flickr dataset
  • Tried day/night separation based on pixel values (did that work out? see the sketch below)
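A sketch of the pixel-based day/night check mentioned above: classify an image by its mean gray value. The threshold and file name are assumptions and would need tuning on the Flickr images:

```python
import cv2

def is_night(path, threshold=60):
    """Assumed heuristic: images with a low mean intensity are 'night' shots."""
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    return float(gray.mean()) < threshold  # mean intensity in [0, 255]

print(is_night("some_flickr_image.jpg"))
```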

Tasks

  • General: store the training data on the server
  • Adrian: set up a Python virtual environment on the server --> virtualenvwrapper
  • Adrian + Fabian: train SSD on buildings / non-buildings (goal: BB generation for arbitrary buildings)
  • Martin: 7k images with non-buildings to help Adrian's training
  • Arthur + Torben: get more Flickr data (Arthur: Brandenburger Tor, Reichstag, Rotes Rathaus, Dom; Torben: Gedächtniskirche, Fernsehturm)
  • A+T: postprocessing of the Flickr data (goal: high-quality training set for SSD)
  • A+T: train SSD with the Flickr data

NumPy: IMPORTANT: row index first, then column index: an image's height comes first, then its width (for use in the network); see the sketch below.
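Illustration of the note above (the shapes are just an example):

```python
import numpy as np

img = np.zeros((480, 640, 3), dtype=np.uint8)  # height 480, width 640, 3 channels
height, width = img.shape[:2]
print(height, width)   # 480 640
print(img[10, 20])     # pixel at row 10 (y), column 20 (x)
```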

Talk in 2 weeks

15 minutes

  1. Overview: what do we want to do
  2. Present related work
  3. Possible solution approaches: how do we want to do it
  4. Milestone plan