BatchQueue: Fast and memory-thrifty core to core communication

  • Thomas Preud'homme
  • Julien Sopena
  • Gaël Thomas
  • Bertil Folliot

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Sequential applications can exploit multi-core systems through pipeline parallelism to improve their performance. With this form of parallelism, core-to-core communication overhead is the main limit on speedup. This paper presents BatchQueue, a fast and memory-thrifty core-to-core communication system based on batch processing of whole cache lines. BatchQueue can send a 32-bit word of data in just 12.5 ns on a Xeon X5472 and needs only 2 full cache lines plus 3 byte-sized variables (each on a different cache line for optimal performance) to work. The characteristics of BatchQueue, namely high throughput and increased latency resulting from its batch processing, make it well suited for highly communicative tasks with no real-time requirements, such as monitoring.
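To make the batching idea concrete, here is a minimal single-threaded model in Python. It is only a sketch under stated assumptions, not the implementation from the paper: the 64-byte line size, the class name `BatchQueueModel`, the method names, and the roles given to the three small variables (a producer index, a consumer index, and a shared flag) are all hypothetical. It illustrates why the consumer sees nothing until a whole line has been filled, and says nothing about real performance.

```python
CACHE_LINE_BYTES = 64  # assumption: line size is not stated in the abstract
WORD_BYTES = 4         # 32-bit words, as in the abstract
WORDS_PER_LINE = CACHE_LINE_BYTES // WORD_BYTES  # 16 words per batch
# 2 data lines + 3 small variables, each on its own line (per the abstract)
FOOTPRINT_BYTES = (2 + 3) * CACHE_LINE_BYTES     # 320 bytes under the assumption


class BatchQueueModel:
    """Two-line batching queue, modelled without threads (hypothetical design)."""

    def __init__(self):
        # Two data buffers laid out back to back, one cache line each.
        self.slots = [None] * (2 * WORDS_PER_LINE)
        self.prod_idx = 0   # next slot to write; producer only (assumed role)
        self.cons_idx = 0   # next slot to read; consumer only (assumed role)
        self.ready = False  # shared flag: a full line is waiting (assumed role)

    def try_push(self, word):
        """Producer side. Returns False where a real producer would spin."""
        last_in_line = (self.prod_idx + 1) % WORDS_PER_LINE == 0
        if last_in_line and self.ready:
            return False  # the consumer has not finished the other line yet
        self.slots[self.prod_idx] = word
        self.prod_idx = (self.prod_idx + 1) % len(self.slots)
        if last_in_line:
            self.ready = True  # hand the full line over in one step
        return True

    def try_pop(self):
        """Consumer side. Returns None until a complete line is available."""
        if not self.ready:
            return None
        word = self.slots[self.cons_idx]
        self.cons_idx = (self.cons_idx + 1) % len(self.slots)
        if self.cons_idx % WORDS_PER_LINE == 0:
            self.ready = False  # whole line consumed; release it
        return word
```

In this model the two sides synchronise once per line instead of once per word, which is the source of both the throughput gain and the added latency the abstract mentions: a word pushed first in a line stays invisible to `try_pop` until the remaining 15 words have been pushed.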

Original language: English
Title of host publication: Proceedings - 22nd International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2010
Pages: 215-222
Number of pages: 8
Publication status: Published - 1 Dec 2010
Externally published: Yes
Event: 22nd International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2010 - Petropolis, Brazil
Duration: 27 Oct 2010 to 30 Oct 2010
