Image processors #196

jlibovicky · 2016-12-13T16:39:18Z

This PR contains:

an image reader that reads a list of image files and ensures they have the same size (it replaces the old image_utils.py)
a simplified CNN encoder that now contains only the convolutions (it requires more work anyway)
another test case: text recognition with a CNN -> RNN decoder architecture
sample data for the test case (40 images)

jindrahelcl · 2016-12-13T16:54:41Z

tests/data/str/train_words.txt

+TBS
+guises
+poppadoms
+SAVAGENESS


SAVAGENESS!!!! :-D

jindrahelcl · 2016-12-13T16:55:59Z

neuralmonkey/encoders/cnn_encoder.py

@@ -172,50 +155,8 @@ def __init__(self, data_id, convolutions, rnn_layers,
                            last_layer, keep_prob=self.dropout_placeholder)

                last_layer = last_layer * last_padding_masks
-            last_layer_size = last_n_channels * image_height * image_width


What did this code do? (Not that it matters...)

There used to be an optional bidirectional RNN layer after the CNN, and this number was used for reshaping. I dropped the RNN because it seemed a bit of a monster to me.

jindrahelcl · 2016-12-13T16:59:17Z

neuralmonkey/evaluators/edit_distance.py

+
+
+# pylint: disable=invalid-name
+EditDistance = _EditDistance()


This is probably not very smart: if you want to change the name (one of the few reasons this is a class :-) ), you will have to put the underscore-prefixed class into the config file.
You can rename the class to EditDistanceEvaluator or something like that and keep the helper there.
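
A minimal sketch of the layout suggested here: a public class that can be instantiated under a different name from a config, plus a ready-made module-level instance. The constructor signature, the `name` argument, and the averaging in `__call__` are assumptions made for illustration, not the code from this PR.

```python
from typing import List


class EditDistanceEvaluator:
    """Hypothetical public evaluator class; its name can be set from a config."""

    def __init__(self, name: str = "edit_distance") -> None:
        self.name = name

    @staticmethod
    def distance(hyp: str, ref: str) -> int:
        """Plain Levenshtein distance, computed row by row."""
        previous = list(range(len(ref) + 1))
        for i, hyp_char in enumerate(hyp, start=1):
            current = [i]
            for j, ref_char in enumerate(ref, start=1):
                substitution = previous[j - 1] + (hyp_char != ref_char)
                current.append(min(previous[j] + 1,     # deletion
                                   current[j - 1] + 1,  # insertion
                                   substitution))
            previous = current
        return previous[-1]

    def __call__(self, hypotheses: List[str], references: List[str]) -> float:
        """Average edit distance over hypothesis/reference pairs."""
        distances = [self.distance(hyp, ref)
                     for hyp, ref in zip(hypotheses, references)]
        return sum(distances) / len(distances) if distances else 0.0


# pylint: disable=invalid-name
# Ready-made helper instance, so configs can keep referring to EditDistance.
EditDistance = EditDistanceEvaluator("edit_distance")
```

With this layout a config can either use the `EditDistance` instance directly or build `EditDistanceEvaluator` with a custom name, without touching an underscore-prefixed class.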

jindrahelcl · 2016-12-13T17:03:46Z

neuralmonkey/encoders/cnn_encoder.py

-
-            self.encoded = encoder_state
+
+            self.encoded = tf.reduce_mean(last_layer, [2, 3])


A comment stating the dimensions of last_layer would help here.

The comment helped mostly me, because of course I was averaging over the wrong dimensions :-D
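
To illustrate why the choice of axes matters, here is a small sketch that assumes the feature map has the layout [batch, height, width, channels]; that layout is an assumption about this encoder. NumPy's `mean` stands in for `tf.reduce_mean` so the snippet runs without a TensorFlow session.

```python
import numpy as np

# Hypothetical feature map: batch of 4, 8x16 spatial grid, 32 channels.
last_layer = np.random.rand(4, 8, 16, 32)

# Averaging over axes 2 and 3 mixes width with channels.
wrong = last_layer.mean(axis=(2, 3))
# Averaging over axes 1 and 2 (height, width) keeps one value per channel.
right = last_layer.mean(axis=(1, 2))

print(wrong.shape)  # (4, 8)
print(right.shape)  # (4, 32)
```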

jindrahelcl · 2016-12-13T17:07:21Z

neuralmonkey/processors/helpers.py

@@ -17,3 +17,14 @@ def untruecase(
            yield [sentence[0].capitalize()] + sentence[1:]
        else:
            yield []
+
+
+def pipeline(processors: List[Callable]) -> Callable:


So in the dataset you set preprocess to <something>, in [something] you set class=pipeline, and as processors you give it a list of preprocessor objects?

Wouldn't it be better to solve this directly in the dataset? Right now it is some kind of tuple or so; how about allowing a list in that tuple instead of a single processor?

That's a good idea, but I would maybe keep both ways. Lines cannot be wrapped in the INI file, so writing several multi-step preprocessors into a single list could become quite unreadable.
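
For reference, a minimal sketch of what such a `pipeline` helper could look like. Only the signature comes from the diff above; the body and the two example processors (`lowercase`, `drop_empty`) are hypothetical.

```python
from typing import Callable, List


def pipeline(processors: List[Callable]) -> Callable:
    """Chain processors so that each one consumes the previous one's output."""
    def process(data):
        for processor in processors:
            data = processor(data)
        return data
    return process


# Two hypothetical sentence-level processors, for illustration only.
def lowercase(sentences):
    return [[word.lower() for word in sentence] for sentence in sentences]


def drop_empty(sentences):
    return [sentence for sentence in sentences if sentence]


preprocess = pipeline([lowercase, drop_empty])
print(preprocess([["Hello", "World"], []]))  # [['hello', 'world']]
```

Because the result is itself a single callable, it can be used wherever the dataset expects one preprocessor, which is what makes the separate config section workable.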

jindrahelcl · 2016-12-13T17:08:26Z

neuralmonkey/readers/image_reader.py

+from typing import Callable, List, Optional
+import os
+import numpy as np
+from scipy import ndimage


Do we have scipy in requirements? Should we add it there? Should we only mention it in the documentation? (Similarly to what we should do for the NLTK BLEU evaluator, when/if it gets merged from the decoder refactor.)

Scipy is a fairly big dependency and it is not needed for translation. So I would write in the documentation that whoever wants to do image processing must have scipy installed.

I would probably not add NLTK, because NLTK is GPL and then NeMo would have to be GPL too.

jindrahelcl · 2016-12-13T17:10:15Z

neuralmonkey/readers/image_reader.py

+                    if len(image.shape) == 2:
+                        channels = 1
+                        image = np.expand_dims(image, 2)
+                    else:


Purely for safety: use elif len(image.shape) == 3 (or > 2) and put some bad-file-format exception into the else branch.
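
A sketch of the defensive variant, assuming the image is a NumPy array; the helper name `add_channel_axis` and the choice of `ValueError` are hypothetical, not part of the reader in this PR.

```python
import numpy as np


def add_channel_axis(image: np.ndarray) -> np.ndarray:
    """Return the image as [height, width, channels]; reject other ranks."""
    if image.ndim == 2:
        # Grayscale image: add a trailing channel axis of size 1.
        return np.expand_dims(image, 2)
    elif image.ndim == 3:
        return image
    else:
        raise ValueError(
            "Bad image format: expected 2 or 3 dimensions, got {}."
            .format(image.ndim))
```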

jindrahelcl · 2016-12-13T17:11:27Z

neuralmonkey/readers/image_reader.py

+                 pad_w: Optional[int]=None,
+                 pad_h: Optional[int]=None,
+                 rescale: bool=False,
+                 mode: str='RGB') -> Callable:


A docstring, or leave it to Dušan for another shot :-)

jindrahelcl · 2016-12-13T17:13:10Z

neuralmonkey/readers/image_reader.py

+
+
+def _rescale(image, pad_w, pad_h):
+    orig_w, orig_h = image.shape[:2]


These three functions are textbook candidates for unit tests :-) If you more or less agree with #197, open a test issue.

jindrahelcl · 2016-12-15T21:26:23Z

neuralmonkey/encoders/cnn_encoder.py

+            # we average out by the image size -> shape is number
+            # channels from the last convolution
+            self.encoded = tf.reduce_mean(last_layer, [1, 2])
+            # TODO assert shape


:D
Don't you want to add the assert there?

In another branch I started writing a smart shape-asserting helper, because I noticed this is done often and the same code keeps being repeated. Shapes are also often written in comments. I will rewrite all of that as asserts (when it can be executed, it is the best possible documentation).
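
A sketch of what such a shape-asserting helper might look like. The name `assert_shape` and the convention that `None` matches any size are assumptions; NumPy arrays are used so the snippet runs on its own. For TensorFlow tensors, the static shape from `tensor.get_shape().as_list()` could be compared in the same way.

```python
from typing import Optional, Sequence

import numpy as np


def assert_shape(array: np.ndarray,
                 expected: Sequence[Optional[int]]) -> None:
    """Check rank and dimensions; None matches any size (e.g. the batch)."""
    actual = array.shape
    matches = (len(actual) == len(expected) and
               all(exp is None or exp == act
                   for exp, act in zip(expected, actual)))
    assert matches, "Expected shape {}, got {}.".format(
        list(expected), list(actual))


# Hypothetical feature map averaged over height and width.
encoded = np.zeros((4, 8, 16, 32)).mean(axis=(1, 2))
assert_shape(encoded, [None, 32])
```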

jindrahelcl · 2016-12-20T16:30:27Z

Travis has a broken setup:
raise ImportError("Could not import the Python Imaging Library (PIL)

tomasmcz

In my opinion scipy should either be a dependency, or this code should be clearly separated from the rest of the package.

jindrahelcl · 2017-01-09T09:31:06Z

As I mentioned in #214 - can we move image processing to the TF graph?

jlibovicky · 2017-01-09T20:29:32Z

Yes, we can.

jlibovicky · 2017-01-16T10:01:06Z

I investigated the TensorFlow image ops and I don't think we want to use them. They cannot load a batch of images: there must be one operation per image (i.e., we would need to know the batch size in advance). Having a reader like this makes it easier to crop and reshape the images to the same size when we don't know the image shape in advance.

I suggest merging the PR as is. @tomasmcz, @jindrahelcl, what do you say?
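
As a rough illustration of the reader-side approach, the sketch below brings images of different sizes to one target size in NumPy and stacks them into a batch. The helper names, the top-left cropping, and the zero padding are assumptions for illustration, not the behaviour of the reader in this PR.

```python
from typing import List

import numpy as np


def crop_or_pad(image: np.ndarray, height: int, width: int) -> np.ndarray:
    """Crop (top-left) or zero-pad an [h, w, c] image to [height, width, c]."""
    cropped = image[:height, :width]
    result = np.zeros((height, width, image.shape[2]), dtype=image.dtype)
    result[:cropped.shape[0], :cropped.shape[1]] = cropped
    return result


def to_batch(images: List[np.ndarray], height: int, width: int) -> np.ndarray:
    """Stack images of different sizes into one [batch, h, w, c] array."""
    return np.stack([crop_or_pad(img, height, width) for img in images])


# Two hypothetical grayscale images with different shapes.
images = [np.ones((30, 100, 1)), np.ones((50, 60, 1))]
batch = to_batch(images, height=40, width=80)
print(batch.shape)  # (2, 40, 80, 1)
```

Since this runs before anything is fed to the graph, neither the batch size nor the original image shapes need to be known when the graph is built.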

jlibovicky requested a review from jindrahelcl December 13, 2016 16:39

jindrahelcl requested changes Dec 13, 2016

jlibovicky force-pushed the image_processors branch from ad47280 to c62c3c2 Compare December 15, 2016 14:04

jindrahelcl reviewed Dec 15, 2016

jlibovicky force-pushed the image_processors branch 2 times, most recently from d74d84e to 282f135 Compare December 16, 2016 17:14

jindrahelcl closed this Dec 20, 2016

jindrahelcl reopened this Dec 20, 2016

jindrahelcl approved these changes Dec 20, 2016

tomasmcz requested changes Dec 20, 2016

jlibovicky force-pushed the image_processors branch from 8738788 to ec9d914 Compare December 22, 2016 13:57

jlibovicky force-pushed the image_processors branch from 19bf984 to 21ec2e4 Compare January 4, 2017 13:11

jlibovicky mentioned this pull request Jan 12, 2017

Coders base class #241

Merged

jlibovicky self-assigned this Jan 12, 2017

jlibovicky added the feature label Jan 13, 2017

jlibovicky added 12 commits January 16, 2017 10:26

pipeline pre/post-processors

13f8210

stub of an image preprocessor

7f4a956

first version of mage reader

f244a2e

sample data for str

eea57be

simplify cnn encoder

bec1779

direct callable for edit distance

3d25fb4

simple test case for cnn -> decoder architecture

9ad1b13

delete depreacated image utils

aed419e

fix channels averaging

a3ace93

addressing review

d879e9d

style and refactor cnn encoder

57c031a

fix str test case

9de5bae

jlibovicky added 4 commits January 16, 2017 10:26

scipy to test requirements

681e11c

change requirements to pillow

4e2062f

image reader using pillow

98027cb

pycodestyle

0a69248

jlibovicky force-pushed the image_processors branch from 7ba3afc to 0a69248 Compare January 16, 2017 09:27

jlibovicky added 2 commits January 16, 2017 10:50

parenthesis in str test ini

66557a3

assert shapes in cnn encoder

f209643

tomasmcz approved these changes Jan 16, 2017

jlibovicky merged commit 0b1d4c5 into master Jan 16, 2017

jlibovicky deleted the image_processors branch January 16, 2017 11:03


