mbanon

Organizations

@paracrawl @bitextor @macocu

Pinned

  1. bitextor/bicleaner Public

    Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus.

    Python, 152 stars, 22 forks

  2. bitextor/bifixer Public

    Tool to fix bitexts and tag near-duplicates for removal

    Python, 29 stars, 3 forks

  3. paracrawl/corset Public

    Corset is a web-based data selection portal that helps you get relevant data from massive amounts of parallel data.

    SCSS, 17 stars, 3 forks

  4. paracrawl/keops Public

    Tool for manual evaluation of parallel sentences.

    PHP, 14 stars, 4 forks

  5. fastspell Public

    Targeted language identifier, based on FastText and Hunspell.

    Python, 30 stars, 4 forks

  6. hplt-project/data-analytics-tool Public

    Data Analytics Tool

    JavaScript, 10 stars, 1 fork