I can’t believe nobody has done this list yet. I mean, there is one about names, one about time and many others on other topics, but not one about languages yet (except one honorable mention that comes close). So, here’s my attempt to list all the misconceptions and prejudices I’ve come across in the course of my long and illustrious career in software localisation and language technology. Enjoy – and send me your own!

  • 2xsaiko@discuss.tchncs.de · 16 hours ago

    Segmenting a text into sentences is as easy as splitting on end-of-sentence punctuation.

    Is there a language this actually isn’t true for? It seems oddly specific like a lot of the others and I don’t think I know of one that does this. Except maybe some wack ass conlangs of course.

    • Giooschi@lemmy.world · 13 hours ago

      Even in English this isn’t true. Dots can appear inside a sentence for several reasons (a decimal number, an abbreviation, a quotation, an ellipsis, etc.), and a naive split on them would break that one sentence into more than one piece.
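
To make the point above concrete, here is a minimal Python sketch of two naive splitters. The function names and the sample sentence are made up for illustration, and neither function is meant as a recommended approach; they only show how the abbreviation, the decimal number and the three dots each trigger a false split.

```python
import re

def split_every_mark(text: str) -> list[str]:
    """Naive: cut at every '.', '!' or '?' regardless of context."""
    return [p.strip() for p in re.split(r"[.!?]", text) if p.strip()]

def split_before_space(text: str) -> list[str]:
    """Slightly less naive: cut only where the mark is followed by whitespace."""
    return [p for p in re.split(r"(?<=[.!?])\s+", text.strip()) if p]

# Hypothetical sample: a human reader sees two sentences here.
sample = "Dr. Smith paid $3.50 for it... or so he said. Really?"

every_mark = split_every_mark(sample)      # 5 pieces: "Dr", "$3" and "50" get separated
before_space = split_before_space(sample)  # 4 pieces: "Dr." and "it..." still split

if __name__ == "__main__":
    print(every_mark)
    print(before_space)
```

Requiring whitespace after the mark rescues the decimal number, but the abbreviation and the three dots still produce extra pieces, so even for English some knowledge of abbreviations and context is needed.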