.   +---------------------+      
+---|---------------------|-------.    .---_||||||_-----------------.   
|:--|-----\\\\\//////-----|--. o   ----                          (\)| 
|#|||------         ------|--.||      [  Peter A. H. Peterson       | 
|-::|-----|.-,   ,-.|-----|===========[  Assistant Professor of CS  |  
|o+o|----(| @ |   @ |)----|:::::::::::[  329 Heller Hall            | 
||o||-----#  '-'    #-----|:::::::::::[  1114 Kirby Drive           | 
|||||------\"==="  /------|===========[  Duluth, MN 55812           | 
|||||-------`-----'|------| | o       [  pahp@d.umn.edu             | 
||/ |.----/'\'-----'/\---.|-+-'       [  pedro@tastytronic.net_  (/)|   
||^_|     '''|     |''    |-----'''----||||-||||||||||-||||||||||.  |  
||| +---------------------+-----------'                          | :| 
|:|                                                              |:.|
| '------------.  [back to research]                             |.:|
| home         |                                                 |:.| 
'------------. |  Backwards-Compatible Fine-Grain Mixing         |.:|
.------------' |  Michael Gray, Peter Peterson, and Peter Reiher |:.|      
| cv           |                                                 |.:|      
| .------------'  [full paper]                                   |:.|      
| '------------.                                                 |.:|      
| research     |  For network transmission, the benefit of data  |:.|      
'------------. |  compression is often outweighed by its cost.   |.:|      
.------------' |  While compression can dramatically increase    |:.|      
| interests    |  throughput under ideal circumstances, higher   |.:|      
| .------------'  than expected bandwidth or lower than          |:.|      
| '------------.  expected CPU availability can make             |.:|      
| blog         |  compression a bottleneck rather than an        |:.|      
'------------. |  optimization. As a result, it is often         |.:|      
.------------' |  confined to systems with predictable and       |:.|      
| reading      |  stable workloads, or like in OpenSSH, is       |.:|      
| .------------'  off-by-default and must be enabled manually.   |:.|      
| '------------.                                                 |.:|      
| quotes       |  Researchers have attempted to improve this     |:.|      
'------------. |  situation with "adaptive compression"          |.:|      
.------------' |  techniques designed to make ideal compression  |:.|      
| contact      |  decisions dynamically in the face of varying   |.:|      
|              |  conditions. Unfortunately, the strength        |:.|      
|''''''''''''''|  parameters of typical compressors  provide     |.:|      
| 001010101101 |  far from linear scaling. In reality,           |:.|      
| 011011011001 |  increasing from minimum to maximum strength    |.:|      
| 010011011011 |  with common compressors such  as DEFLATE       |:.|      
| 110011010111 |  offers only a small increase in data           |.:|      
| 100100001100 |  reduction while incurring a large CPU          |:.|      
| 101001100111 |  penalty. Selecting  among compression          |.:|      
| 001101100101 |  algorithms can provide a broader range of      |:.|      
| 001111000000 |  possibilities, but still fails to provide a    |.:|      
| 001010101101 |  real spectrum of compression ratio and CPU     |:.|      
| 011011011001 |  costs between those algorithms.                |.:|      
| 010011011011 |                                                 |:.|      
| 110011010111 |  Additionally, there is usually a significant   |.:|      
| 100100001100 |  difference (in terms of compression ratio and  |:.|      
| 101001100111 |  the CPU time required) between "no             |.:|      
| 001101100101 |  compression" and the "cheapest" option,        |:.|      
| 001111000000 |  regardless of the algorithm chosen.  Because   |.:|      
| 001010101101 |  of this large performance gap, there can be    |:.|      
| 011011011001 |  significant underutilization -- even when the  |.:|      
| 010011011011 |  "best" strategy is chosen.  If no compression  |:.|      
| 110011010111 |  is the best choice, the CPU will be left       |.:|      
| 100100001100 |  idle. If compressing is the best choice,       |:.|      
| 101001100111 |  "overcompression" may reduce data              |.:|      
| 001101100101 |  unnecessarily, wasting CPU time and            |:.|      
| 001111000000 |  underutilizing bandwidth.                      |.:|      
| 001010101101 |                                                 |:.|      
| 011011011001 |  Pu and Singaravelu identified this problem     |.:|      
| 010011011011 |  and introduced the concept of Fine-Grain       |:.|      
| 110011010111 |  Mixing (FG-Mixing) for adaptive compression    |.:|      
| 100100001100 |  in dynamically varying networks. Rather than   |:.|      
| 101001100111 |  trying to pick the best single strategy        |.:|      
| 001101100101 |  (potentially underusing some resource),        |:.|      
| 001111000000 |  FG-Mixing compressed a portion of the data in  |.:|      
| 001010101101 |  question with an algorithm, sending the rest   |:.|      
| 011011011001 |  uncompressed. As available CPU increased       |.:|      
| 010011011011 |  (more cycles for compression), or network      |:.|      
| 110011010111 |  bandwidth decreased (relatively more time for  |.:|      
| 100100001100 |  compression), a greater proportion of blocks   |:.|      
| 101001100111 |  in the outgoing buffer could be compressed,    |.:|      
| 001101100101 |  improving effective throughput. Similarly,     |:.|      
| 001111000000 |  when available CPU decreased (less cycles for  |.:|      
| 001010101101 |  compression), or network bandwidth increased   |:.|      
| 011011011001 |  (relatively less time for compression), a      |.:|      
| 010011011011 |  greater proportion of the blocks would simply  |:.|      
| 110011010111 |  be transmitted uncompressed.                   |.:|      
| 100100001100 |                                                 |:.|      
| 101001100111 |  Pu and Singaravelu's solution was able to      |.:|      
| 001101100101 |  maximize both CPU and bandwidth usage,         |:.|      
| 001111000000 |  successfully scaling from the network          |.:|      
| 001010101101 |  constrained case where compressing all blocks  |:.|      
| 011011011001 |  improved throughput, up to 400 megabits per    |.:|      
| 010011011011 |  second where the majority of blocks were sent  |:.|      
| 110011010111 |  uncompressed. Their implementation             |.:|      
| 100100001100 |  significantly outperformed a model simple      |:.|      
| 101001100111 |  switching approach, both in terms of resource  |.:|      
| 001101100101 |  utilization and throughput. They also          |:.|      
| 001111000000 |  demonstrated that the concept of FG-Mixing     |.:|      
| 001010101101 |  generalizes to many algorithms by performing   |:.|      
| 011011011001 |  mixing experiments with a selection of         |.:|      
| 010011011011 |  compressors, including the GNU Gzip            |:.|      
| 110011010111 |  implementation of deflate and bzip2.           |.:|      
| 100100001100 |                                                 |:.|      
| 101001100111 |  We explore a different view of FG-Mixing and   |.:|      
| 001101100101 |  give another explanation for its flexibility   |:.|      
| 001111000000 |  and performance: that -- unlike legacy         |.:|      
| 001010101101 |  strength options -- fine-grain mixing of       |:.|      
| 011011011001 |  compressed and uncompressed blocks gives       |.:|      
| 010011011011 |  inflexible legacy compressors a continuous     |:.|      
| 110011010111 |  and responsive "knob" for scaling compression  |.:|      
| 100100001100 |  linearly with respect to CPU usage.            |:.|      
| 101001100111 |  Additionally, this reliable control allows     |.:|      
| 001101100101 |  bandwidth use to be maximized                  |:.|      
| 001111000000 |  opportunistically -- increasing when it is     |.:|      
| 001010101101 |  profitable, decreasing when it is not --       |:.|      
| 011011011001 |  rather than relying on predictive models.      |.:|      
| 010011011011 |  This alternative view suggests a variety of    |:.|      
| 110011010111 |  significant efficiency improvements to the     |.:|      
| 100100001100 |  original work including better compression     |:.|      
| 101001100111 |  ratios for a given level of CPU consumption,   |.:|      
| 001101100101 |  a wider range of data reduction and CPU cost   |:.|      
| 001111000000 |  options, and parallelization to take           |.:|      
| 001010101101 |  advantage of multi-core CPUs.                  |:.|      
| 011011011001 |                                                 |.:|      
| 010011011011 |  Additionally, Pu and Singaravelu's FG-Mixing   |:.|      
| 110011010111 |  implementation used a custom protocol for      |.:|      
| 100100001100 |  transmitting the mixture of compressed and     |:.|      
| 101001100111 |  uncompressed data, requiring non-standard      |.:|      
| 001101100101 |  decompressors. Instead, we show that           |:.|      
| 001111000000 |  fine-grain mixed compression can be            |.:|      
| 001010101101 |  implemented without performance degradation    |:.|      
| 011011011001 |  while remaining fully backwards-compatible     |.:|      
| 010011011011 |  with the ubiquitous deflate specification.     |:.|      
| 110011010111 |  This means that off-the-shelf systems          |.:|      
| 100100001100 |  employing deflate-based compression could      |:.|      
| 101001100111 |  benefit from fine-grain mixing without         |.:|      
| 001101100101 |  significant re-engineering or broken           |:.|      
| 001111000000 |  compatibility. This virtually eliminates the   |.:|      
| 001010101101 |  technical obstacles to FG-Mixing's adoption    |:.|      
| 011011011001 |  in existing, heterogeneous, and widely         |.:|      
| 010011011011 |  distributed systems.                           |:.|      
| 110011010111 |                                                 |.:|      
| 100100001100 |                                                 |:.|      
| 101001100111 |  [back to research]                             |.:|      
| 001101100101 |                                                 |:.|      
|'''''''''''''''-------------------------------------------------'::'. 
|::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::| 
`=======---------.______.-------:::::::-:------------------------. ? | 
                                                                  ---'