I’ve noticed some files I opened in a text editor have all kinds of crazy unrenderable chars

  • palebluethought@lemmy.world
    English · 10 upvotes · 2 months ago

    Can it? Sure, almost any arrangement of bits can be converted into some kind of Unicode text. Can it be converted to something meaningful or readable? No. Some formats are plain text (.txt, .ini, .json, .html, for some random examples) and are meant to be read by humans. Others are binary formats that are only meaningful when a piece of software decodes them into specific data structures.
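
As a rough illustration of "any bits can become text", here is a minimal Python sketch. The byte values are made up for the example (they happen to start like a PNG file), and the two decoders are just two possible choices, not what any particular editor does.

```python
# Ten arbitrary bytes; the first eight match the start of a PNG file.
data = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A, 0x00, 0xFF])

# Latin-1 assigns a code point to every byte value 0-255, so this never fails.
as_latin1 = data.decode("latin-1")

# UTF-8 rejects some byte sequences; errors="replace" swaps in U+FFFD for them.
as_utf8 = data.decode("utf-8", errors="replace")

print(repr(as_latin1))
print(repr(as_utf8))
```

Both calls succeed, but neither result is readable: the output is mostly control characters and replacement marks.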

  • Nibodhika@lemmy.world
    6 upvotes · 1 month ago

    At the end of the day data is just binary, i.e. it’s composed of 0s and 1s. What those 0s and 1s represent is mostly irrelevant to this discussion. The short version is that 01000001 can mean A, or it can mean that a given pixel has a red level of 65 out of 255, or that the speaker should be at a specific position at one instant, etc.
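
A small Python sketch of the same byte read three ways. The audio line assumes unsigned 8-bit samples where 128 means silence (the convention used by 8-bit WAV files); other formats scale differently.

```python
bits = "01000001"
value = int(bits, 2)             # the byte as a plain integer: 65

as_char = chr(value)             # read as ASCII/Unicode text: 'A'
as_red = value / 255             # read as an 8-bit red channel, 0.0 to 1.0
as_sample = (value - 128) / 128  # read as unsigned 8-bit audio, -1.0 to about 1.0

print(value, as_char, round(as_red, 3), as_sample)
```

Nothing in the byte itself says which reading is the right one; that comes from the file format or the program opening it.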

    So what happens when you open a file that’s not text in a text editor? Some of the 0s and 1s decode to gibberish, or to control characters that are not meant to be printed. Fun fact: you can try this the other way around too, i.e. open a text file as an image, but it will most likely fail to load. Image formats carry header information about size, compression, etc., and if that is wrong the program won’t know what to do. A text editor is far more forgiving, because almost any byte can be shown as some character, so the file will usually open, although your editor might show weird things where there’s a non-printable character.
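
To make the "images won’t load" part concrete, here is a minimal sketch of the kind of check an image loader might do first. PNG files do start with this fixed 8-byte signature; the helper name `looks_like_png` is made up for the example, and real loaders check far more than this.

```python
# The fixed first 8 bytes of every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def looks_like_png(data: bytes) -> bool:
    """Return True if the data starts with the PNG signature."""
    return data.startswith(PNG_SIGNATURE)

text_bytes = "Hello, world!".encode("utf-8")

print(looks_like_png(text_bytes))                       # False
print(looks_like_png(PNG_SIGNATURE + b"rest of file"))  # True
```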

    • cheese_greater@lemmy.worldOP
      1 upvote · edited · 2 months ago

      Can you comment on the specific makeup of a “rendered” audio file in plaintext, how is the computer representing every little noise bit of sound at any given point, the polyphony etc?

      What are the conventions of such representation? How can a spectrogram tell pitches are where they are, how is the computer representing that?

      Is it the same to view plaintext as analysing it with a hex-viewer?

      • AbouBenAdhem@lemmy.world
        English · 2 upvotes · 2 months ago

        Most binary-to-text encodings don’t attempt to make the text human-readable—they’re just intended to transmit the data over a text-only medium to a recipient who will decode it back to the original binary format.

        • cheese_greater@lemmy.worldOP
          1 upvote · edited · 2 months ago

          I do understand I’m not able to read it myself, I’m more curious about the architecture of how that data is represented and stored and conceptually how such representation is practically organized/reified…

          • AbouBenAdhem@lemmy.world
            English · 2 upvotes · edited · 2 months ago

            In Base64, the most common binary-to-text encoding, the original binary data is split into six-bit chunks (e.g., 100101), each of which corresponds to an integer from 0 to 63. These are just mapped to characters in order:

            000000 = A,
            000001 = B,
            000010 = C,
            000011 = D,

            etc.—it goes through the capital letters first, then lower-case letters, then digits, then “+” and “/”. It’s so simple you could do it by hand from the above description, if you were looking at the data in binary format.
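
The by-hand procedure above can be sketched in Python like this. The zero-bit padding and the trailing "=" characters are not in the description above; they are the standard Base64 way of handling input whose length is not a multiple of three bytes. The function name `base64_by_hand` is made up for the example.

```python
import base64
import string

# Capital letters, lower-case letters, digits, then "+" and "/": 64 symbols.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def base64_by_hand(data: bytes) -> str:
    # Write every byte as 8 bits and join them into one long bit string.
    bits = "".join(f"{byte:08b}" for byte in data)
    # Pad with zero bits so the length is a multiple of 6.
    bits += "0" * (-len(bits) % 6)
    # Map each 6-bit chunk (0..63) to its symbol.
    encoded = "".join(ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6))
    # Pad with "=" so the output length is a multiple of 4.
    return encoded + "=" * (-len(encoded) % 4)

print(base64_by_hand(b"Man"))
print(base64.b64encode(b"Man").decode("ascii"))
```

Both lines should print the same text, since the standard library does the same mapping.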

          • intensely_human@lemm.ee
            2 upvotes · 1 month ago

            One representation of a sound wave is a sequence of amplitudes, expressed as binary values. Each sequential chunk of N bits is a number, and that number represents the amplitude of the sound signal at a moment in time. Those moments in time are spaced at equal intervals. One common sampling rate is 44.1 kHz.
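
A minimal sketch of that "sequence of amplitudes" idea in Python. The 440 Hz tone, the 16-bit sample size (N = 16), and the little-endian byte order are assumptions picked for the example; they match what uncompressed WAV audio commonly uses, but other formats differ.

```python
import math
import struct

SAMPLE_RATE = 44_100  # samples per second
FREQUENCY = 440.0     # Hz, an arbitrary test tone
MAX_AMPLITUDE = 32767 # largest value a signed 16-bit sample can hold

def sine_samples(count: int) -> list[int]:
    """Amplitude of the tone at equally spaced moments, as 16-bit integers."""
    return [
        round(MAX_AMPLITUDE * math.sin(2 * math.pi * FREQUENCY * i / SAMPLE_RATE))
        for i in range(count)
    ]

samples = sine_samples(8)
# "<" = little-endian, "h" = signed 16-bit integer, one per sample.
raw = struct.pack(f"<{len(samples)}h", *samples)

print(samples)
print(raw.hex(" "))
```

The hex output is roughly what a hex viewer would show for the sample data inside an uncompressed audio file.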

            That number is chosen because of the Nyquist-Shannon sampling theorem, in combination with the fact that humans tend to be able to hear sounds up to about 20 kHz.

            The sampling theorem says that if you want to reproduce a signal containing information at up to X frequency, you need to sample it at more than 2X frequency. Twice 20 kHz is 40 kHz, and 44.1 kHz leaves some margin above that.
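
Here is a small Python sketch of what goes wrong when the rule is broken. The two frequencies are chosen for the example: 30 kHz is above half of 44.1 kHz, and at this sampling rate its samples come out identical to those of a 14.1 kHz tone (44.1 - 30 = 14.1), so the two can’t be told apart afterwards.

```python
import math

SAMPLE_RATE = 44_100  # samples per second

def sample_tone(freq_hz: float, count: int) -> list[float]:
    """Sample a cosine wave of the given frequency at SAMPLE_RATE."""
    return [math.cos(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(count)]

within_limit = sample_tone(14_100, 50)  # below SAMPLE_RATE / 2
too_high = sample_tone(30_000, 50)      # above SAMPLE_RATE / 2

# Largest difference between the two sample sequences.
max_diff = max(abs(a - b) for a, b in zip(within_limit, too_high))
print(max_diff)
```

The difference is only floating-point noise, which is the aliasing the theorem warns about.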

            To learn more about this topic, look for texts, classes, or videos on “signal processing”. It’s often taught in classes that also cover electronic circuits.

            Here is an example of such a text

            That’s pretty dense reading, but if you’re willing to stop and learn any math you encounter while reading it, it will probably blow your mind into a whole new level of understanding the world.

            • cheese_greater@lemmy.worldOP
              2 upvotes · 1 month ago

              I honestly wish I had gotten into all the science and physics of signal processing, taken calculus, etc. I feel like I’ll pick up a lot of the more qualitative stuff over time, particularly if I’m able to apply it in building certain apps that do some novel manipulations. Obviously some of that will require me to get an operational understanding of how to put all these blocks together.

              • intensely_human@lemm.ee
                2 upvotes · 1 month ago

                You still can. Worst case, you spend $80 now and then on a textbook. There’s no reason you can’t buy a physics or calculus textbook and just start reading it. Costs about the same as an expensive dinner for two.

                Best case, you just learn it for free or for the cost of a Khan Academy membership.

                You’re not limited to surface level understanding. You can develop as deep an understanding of any topic as anyone else. In fact, I would wager an adult who knows how to work can probably learn math and physics at a much deeper level than a college engineering student, if only because they can take their time and absorb everything fully.

                Sounds like you might be a coder. Consider the level of code quality people achieve in hobby projects: often much better than in a professional setting because in the pro setting there’s always a time and budget constraint. In a hobby project, one can polish and polish and take all the time they want.

                It’s never too late to give yourself a solid science education.

    • cheese_greater@lemmy.worldOP
      2 upvotes · edited · 2 months ago

      I just mean, like, any file (pdf, jpeg, mp4, mp3, exe), mp4/mp3 most famously for me.

      I find it so damn cool and incredible that I can record something/anything right now, open the audio in a text editor, and it’s all right there, albeit in an incomprehensible format, but there altogether.

      It’s like a thinking rock etching sound into stone