I noticed that the Monero chain compresses about 60%. Would it be possible to compress blocks before sending them from a remote node to a syncing wallet, thus saving a big chunk of bandwidth and time?

Does anyone know if this is already happening during sync or, if not, why not?

edit: this can be done using ssh tunnels, if you have ssh access to your remote server. The “-C” option enables compression.

ssh -C -NL 18089:localhost:18089 server_username@server_address

Now you can point your wallet at 127.0.0.1:18089, and syncing should be faster. Enjoy!

  • mister_monster · 7 months ago · 6 upvotes

    So you’ve got two components to sync time: bandwidth and processing. In Monero we already have to attempt to decrypt the transactions in each block to see if they’re ours. This is what really takes time with regard to syncing.

    If you compressed blocks, you’d save some bandwidth, but you’d take time client side to uncompress before sync. This adds to sync time. A user with high processing power using a node with low bandwidth might see a benefit, but for most people the bottleneck isn’t bandwidth, it’s processing power. Most people wouldn’t see a sync time improvement with your proposed scheme.
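
The bandwidth-versus-processing question can be framed as a back-of-the-envelope estimate. The sketch below is only an illustration: `sync_bottleneck` is a hypothetical helper, and every figure in it is a made-up assumption, not a measurement of a real wallet or node.

```python
def sync_bottleneck(total_bytes: float, bandwidth_bytes_per_s: float,
                    scan_bytes_per_s: float) -> tuple[str, float, float]:
    """Return which stage dominates, plus download and scan times in seconds."""
    download_s = total_bytes / bandwidth_bytes_per_s
    scan_s = total_bytes / scan_bytes_per_s
    dominant = "bandwidth" if download_s > scan_s else "processing"
    return dominant, download_s, scan_s


# Hypothetical figures: 1 GB of blocks, a wallet that scans 2 MB/s.
print(sync_bottleneck(1e9, 1e6 / 8, 2e6))     # 1 Mbit/s link
print(sync_bottleneck(1e9, 100e6 / 8, 2e6))   # 100 Mbit/s link
```

With these invented numbers the slow link is bandwidth-bound and the fast link is processing-bound; which case applies to a given user depends on their own measurements.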

    • tusker (OP) · 7 months ago · 1 upvote

      Decompression is a very fast operation. There are many locations where bandwidth is 1 Mbit/s and maxes out at 10 Mbit/s, not to mention that bandwidth is also metered. With blocks now 3x the size they were a month ago, it would be a significant improvement in terms of speed and cost. Blocks will only get bigger going forward.

      • mister_monster · 7 months ago (edited) · 1 upvote

        The number of operations required to decompress depends on the compression protocol and on how compressed something is, so it can be fast or slow. More importantly, the relationship between the compute required to decompress and the amount of compression is not linear: 10% more compression does not translate to 10% more computation to decompress, it takes more than that. So at some point you’re taking more time to decompress than you saved downloading over your constrained connection. That point is different for every node (or more accurately, every pair of nodes, since the max bandwidth is the lower of the two communicating), so the more compression you use, the more you favor low bandwidth, high power nodes. I don’t know what the median or mean processing power is for nodes, and I don’t know what the median or mean bandwidth is. I’m sure some compression would benefit the network overall, but you’re always benefitting some nodes at the expense of others in doing it, and there’s no optimal scheme for all nodes on the network. This optimum is also ever changing as people upgrade hardware and connections.
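
How compression and decompression cost scale with the compression level depends on the codec, so it may be easiest to measure. A minimal sketch, assuming Python's standard `zlib` module as a stand-in codec and a synthetic payload; `sweep_levels` is a hypothetical helper, and real block data would have to be substituted to say anything about Monero.

```python
import os
import time
import zlib


def sweep_levels(data: bytes, levels=(1, 6, 9)):
    """Return (level, compressed_fraction, compress_s, decompress_s) per level."""
    rows = []
    for level in levels:
        t0 = time.perf_counter()
        packed = zlib.compress(data, level)
        t1 = time.perf_counter()
        unpacked = zlib.decompress(packed)
        t2 = time.perf_counter()
        assert unpacked == data  # the round trip is lossless
        rows.append((level, len(packed) / len(data), t1 - t0, t2 - t1))
    return rows


# Stand-in payload (an assumption): part repetitive, part random bytes.
sample = b"monero-block-header" * 50_000 + os.urandom(500_000)
for level, fraction, c_s, d_s in sweep_levels(sample):
    print(f"level {level}: {fraction:.2%} of original, "
          f"compress {c_s:.4f}s, decompress {d_s:.4f}s")
```

The timings vary by machine, which is the point: each node would see its own numbers.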

        It might make sense to allow nodes to request compressed blocks from each other in the RPC, like a field in the request that says “send compressed blocks”, so that high power, low bandwidth nodes can ask for it. But compression also has a processing requirement, and the node being asked might not want to do it. It could cache compressed blocks, since blocks don’t change, but then it has to decompress them every time it has to access them, or store a compressed and an uncompressed version of each block if it needs constant access but wants to send compressed blocks. It’s trade-offs all the way down. There are considerations that can be made. But is it worth it? I don’t know. Also consider that adding a field to the request can be used for fingerprinting: the more granular you make RPC requests, the more data points can be used to fingerprint the node, which is a problem over Tor or i2p.
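
The opt-in request and the caching trade-off can be sketched roughly as follows. This is a toy model only: `BlockServer`, `get_block`, and the `compressed` flag are hypothetical names invented for illustration and do not correspond to the actual Monero RPC. It keeps both versions of a block, trading storage for not having to recompress.

```python
import zlib


class BlockServer:
    """Toy node that serves blocks raw or compressed (hypothetical, not monerod)."""

    def __init__(self, blocks: dict[int, bytes]):
        self._raw = blocks                     # height -> raw block bytes
        self._packed: dict[int, bytes] = {}    # height -> cached compressed copy

    def get_block(self, height: int, compressed: bool = False) -> bytes:
        raw = self._raw[height]
        if not compressed:
            return raw
        # Blocks never change, so each one is compressed at most once.
        if height not in self._packed:
            self._packed[height] = zlib.compress(raw)
        return self._packed[height]

    def cache_overhead_bytes(self) -> int:
        """Extra storage spent on keeping the compressed copies."""
        return sum(len(p) for p in self._packed.values())


# Placeholder payloads, not real blocks.
server = BlockServer({1: b"\x00" * 4096, 2: b"example block " * 300})
wire = server.get_block(2, compressed=True)    # what a low-bandwidth peer would ask for
assert zlib.decompress(wire) == server.get_block(2)
print(len(server.get_block(2)), len(wire), server.cache_overhead_bytes())
```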

        • tusker (OP) · 7 months ago (edited) · 1 upvote

          There are established compression standards which should avoid all of the issues you mention. Obviously we would not compress to the point where it takes longer to decompress than to download over a 1 Mbit/s connection, or to the point of causing data loss.

          Most software distributed over the internet is compressed despite all the “unknowns” being present. Data stream compression is likewise beneficial and established when transferring large amounts of data to remote locations, such as backups.
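
As an illustration of data stream compression, here is a minimal sketch using Python's standard `zlib` streaming objects. The helper names and the payload are assumptions made up for the example; it only demonstrates that chunks can be compressed as they are sent and restored without loss.

```python
import zlib


def compress_stream(chunks):
    """Yield compressed pieces for an iterable of byte chunks."""
    packer = zlib.compressobj()
    for chunk in chunks:
        piece = packer.compress(chunk)
        if piece:
            yield piece
    yield packer.flush()


def decompress_stream(pieces):
    """Yield the original bytes back from compressed pieces."""
    unpacker = zlib.decompressobj()
    for piece in pieces:
        data = unpacker.decompress(piece)
        if data:
            yield data
    tail = unpacker.flush()
    if tail:
        yield tail


# Placeholder data, not real blocks.
blocks = [(b"block-%d " % i) * 200 for i in range(5)]
wire = list(compress_stream(blocks))
restored = b"".join(decompress_stream(wire))
assert restored == b"".join(blocks)   # lossless round trip
print(sum(map(len, blocks)), "bytes in,", sum(map(len, wire)), "bytes on the wire")
```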

          Let us not get caught up in analysis paralysis and instead stick to practical solutions that will benefit the majority of users.

    • SummerBreeze · 7 months ago · 1 upvote

      How come the fees and the choice of node (public or not) made such a difference if the bottleneck was processing power?

      • mister_monster · 7 months ago (edited) · 2 upvotes

        Well, the bandwidth of the remote node is a potential bottleneck, as is the bandwidth of the person syncing. Whichever is smaller is going to be the max rate at which data is sent, ignoring the connection path for simplicity. It can affect the speed of sync significantly. If you’ve got a powerful computer that can do a ton of operations per second and check a ton of blocks for transactions, your bottleneck is going to be bandwidth. If we decide to compress the blocks as you get them, you can alleviate that, at the cost of decompressing the blocks and so slowing your processing of them. The more you compress the blocks, the longer it takes to decompress the data, and this relationship is not linear; 10% more compression requires more than 10% more processing time to decompress. Compressing too much eats up the bandwidth benefit you’re going to get, and there’s a point of equilibrium that’s different for each node on the network, based on its bandwidth and processing power.

        Obviously, we cannot compress differently for each node, so compressing is necessarily a trade-off between bandwidth and hardware capability: any compression favors low bandwidth, higher power nodes, and no compression favors higher bandwidth, lower power nodes. Further, your compression scheme cannot compress beyond certain limits without becoming lossy, so there’s a practical limit even ignoring processing time. You also have to consider the processing power of the remote node, since it has to compress the blocks.
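
The point of equilibrium can be written down under a deliberately simple model. The sketch assumes download and decompression happen one after the other, ignores the remote node's compression cost and the wallet's scanning time, and uses hypothetical function names and made-up numbers. Under those assumptions, compression pays off when the link is slower than the decompression throughput times the fraction of bytes saved.

```python
def break_even_bandwidth(compressed_fraction: float,
                         decompress_bytes_per_s: float) -> float:
    """Link speed (bytes/s) below which compression shortens download + decompress time.

    compressed_fraction is compressed size / original size;
    decompress_bytes_per_s is measured in output (uncompressed) bytes.
    """
    return decompress_bytes_per_s * (1.0 - compressed_fraction)


def compression_helps(node_bw: float, wallet_bw: float,
                      compressed_fraction: float,
                      decompress_bytes_per_s: float) -> bool:
    link = min(node_bw, wallet_bw)   # the slower side sets the transfer rate
    return link < break_even_bandwidth(compressed_fraction, decompress_bytes_per_s)


# Invented numbers: a 10 Mbit/s wallet link and a fast decompressor...
print(compression_helps(1e9, 10e6 / 8, 0.4, 100e6))
# ...versus a 100 Mbit/s link and a very slow decompressor.
print(compression_helps(100e6 / 8, 100e6 / 8, 0.4, 1e6))
```

The first case comes out in favor of compression and the second against it, which matches the idea that the answer differs per pair of nodes.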