
Diary/2009-1-4

Returning to Tokyo

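By Shinkansen.
I got up to the platform just as a train had left,
so I ended up queuing for a full train interval.
A TV camera on a tripod was filming right beside me,
so I am surely not in the shot at all.
I could not get a window seat with a power outlet,
but I did manage to secure a seat, and off we went to Tokyo.
The aisles were packed with people too, so even getting to the toilet was a struggle.
Well, just being able to sit down is fortunate enough.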

Programming with Tiles

Adds two new classes, dynamic partitioning and overlapped tiling,
to the Hierarchically Tiled Array proposed at PPoPP 2006.
Evaluated by comparing ease of programming and performance.

HTA

Hierarchically Tiled Arrays (HTAs) are arrays that may be partitioned into tiles. These tiles can be conventional arrays or lower-level HTAs. Tiles can be distributed across processors in a distributed-memory machine or be stored in a single machine according to a user-specified layout.

The C++ implementation of the HTA class is a library with ~18000 lines of code. It only contains header files, as most classes in the library are C++ templates to facilitate inlining.
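To make the tiling idea concrete, here is a minimal Python sketch of a 2D array split into equally sized tiles, with a round-robin mapping of tiles to processors. It is not the HTA library's API: the class `TiledArray` and its methods are hypothetical names, and only one level of tiling is modeled (a real HTA tile may itself be an HTA).

```python
class TiledArray:
    """Toy 2D array split into equally sized tiles (hypothetical, not the HTA API)."""

    def __init__(self, rows, cols, tile_rows, tile_cols):
        if rows % tile_rows or cols % tile_cols:
            raise ValueError("tile shape must divide the array shape")
        self.tile_shape = (tile_rows, tile_cols)
        # Number of tiles along each dimension.
        self.grid = (rows // tile_rows, cols // tile_cols)
        # tiles[i][j] is a conventional array (list of rows) holding one tile.
        self.tiles = [
            [[[0] * tile_cols for _ in range(tile_rows)] for _ in range(self.grid[1])]
            for _ in range(self.grid[0])
        ]

    def tile(self, i, j):
        """Return tile (i, j) as a conventional array."""
        return self.tiles[i][j]

    def _locate(self, r, c):
        # Map a global index to (tile row, tile column, local row, local column).
        tr, tc = self.tile_shape
        return r // tr, c // tc, r % tr, c % tc

    def get(self, r, c):
        i, j, lr, lc = self._locate(r, c)
        return self.tiles[i][j][lr][lc]

    def set(self, r, c, value):
        i, j, lr, lc = self._locate(r, c)
        self.tiles[i][j][lr][lc] = value

    def owner(self, i, j, nprocs):
        """Round-robin mapping of tile (i, j) to a processor: one possible layout."""
        return (i * self.grid[1] + j) % nprocs


a = TiledArray(4, 6, 2, 3)   # a 4x6 array as a 2x2 grid of 2x3 tiles
a.set(3, 4, 42)              # global element (3, 4) lives in tile (1, 1)
```

The same element is reachable through global indices (`a.get(3, 4)`) or through its tile (`a.tile(1, 1)[1][1]`), which is the dual view the notes above describe.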

Dynamic partitioning

Cache-oblivious algorithms and FLAME require dynamic changes of the tile layout.
Adds part/rmPart.
← TBB requires more lines of code, variables, and data types than the HTA to express the same problem
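As a rough illustration of changing the tile layout at run time, the sketch below keeps a list of partition points over a 1D array and lets the caller add or remove one. The names `Partitioned1D`, `part`, and `rm_part` are hypothetical and only loosely modeled on the part/rmPart operations mentioned above; the real signatures are not given in these notes.

```python
import bisect


class Partitioned1D:
    """1D array whose tiling is defined by a sorted list of cut positions."""

    def __init__(self, data):
        self.data = list(data)
        self.cuts = []  # interior cut positions, kept sorted

    def part(self, pos):
        """Add a partition line before index `pos` (hypothetical analogue of part)."""
        if not 0 < pos < len(self.data):
            raise ValueError("cut must fall strictly inside the array")
        if pos not in self.cuts:
            bisect.insort(self.cuts, pos)

    def rm_part(self, pos):
        """Remove the partition line at `pos` (hypothetical analogue of rmPart)."""
        self.cuts.remove(pos)

    def tiles(self):
        """Return the current tiles as copies of the underlying slices."""
        bounds = [0] + self.cuts + [len(self.data)]
        return [self.data[lo:hi] for lo, hi in zip(bounds, bounds[1:])]


v = Partitioned1D(range(8))
v.part(4)
v.part(2)
two_cuts = v.tiles()   # three tiles
v.rm_part(4)
one_cut = v.tiles()    # the last two tiles are merged again
```

The data itself never moves; only the list of cuts changes, which is why repartitioning inside a recursive algorithm can be cheap in a design like this.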

Overlapped tiling

Stencil codes benefit from tiling, because it increases locality and determines the data distribution when running in parallel.
← programmers create a shadow or ghost region around each tile that contains a copy of the elements of the neighbor tiles
← the shadow regions are updated automatically or manually
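The shadow-region idea can be sketched with a 1D Jacobi stencil. In the sketch below, each tile carries one ghost cell on each side, `update_ghosts` refreshes those cells from the neighbor tiles by hand, and the tiled result is compared with an untiled reference. All function names are illustrative assumptions; an overlapped-tiling class would perform the `update_ghosts` step on the programmer's behalf.

```python
def jacobi_reference(u, steps):
    """Untiled 1D Jacobi: each interior point becomes the mean of its neighbors."""
    u = list(u)
    for _ in range(steps):
        new = u[:]
        for i in range(1, len(u) - 1):
            new[i] = 0.5 * (u[i - 1] + u[i + 1])
        u = new
    return u


def make_tiles(u, tile_size):
    # Each tile is [ghost_left] + owned elements + [ghost_right].
    return [[0.0] + u[s:s + tile_size] + [0.0] for s in range(0, len(u), tile_size)]


def update_ghosts(tiles):
    """Copy boundary elements of neighbor tiles into each tile's ghost cells."""
    for t, tile in enumerate(tiles):
        if t > 0:
            tile[0] = tiles[t - 1][-2]    # last owned element of the left neighbor
        if t < len(tiles) - 1:
            tile[-1] = tiles[t + 1][1]    # first owned element of the right neighbor


def jacobi_tiled(u, tile_size, steps):
    tiles = make_tiles(list(u), tile_size)
    last = len(tiles) - 1
    for _ in range(steps):
        update_ghosts(tiles)              # manual shadow-region update
        for t, tile in enumerate(tiles):
            new = tile[:]
            for i in range(1, len(tile) - 1):
                # The two ends of the global array are fixed boundary values.
                at_edge = (t == 0 and i == 1) or (t == last and i == len(tile) - 2)
                if not at_edge:
                    new[i] = 0.5 * (tile[i - 1] + tile[i + 1])
            tiles[t] = new
    # Drop the ghost cells and concatenate the owned elements.
    return [x for tile in tiles for x in tile[1:-1]]


u0 = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 8.0]
tiled = jacobi_tiled(u0, 4, 3)
reference = jacobi_reference(u0, 3)
```

After the ghost cells are refreshed, every tile can be updated using only its own storage, which is what makes the tiles independent units of work within one sweep.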

Evaluation
  • Performance evaluation
    • sequential (matrix multiplication, LU decomposition, 3D Jacobi)
    • parallel (Parallel Merge, MG/LU NAS)
  • Readability/Productivity
    • the programming effort[17]
    • the cyclomatic number[22]
    • lines of code
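For reference, assuming the cyclomatic number here is McCabe's measure, it is computed from a control-flow graph as V(G) = E - N + 2P, where E is the number of edges, N the number of nodes, and P the number of connected components. The small sketch below evaluates that formula on hand-written edge lists; the graphs are illustrative examples, not taken from the paper.

```python
def cyclomatic_number(edges, components=1):
    """McCabe's cyclomatic number V(G) = E - N + 2P for a control-flow graph."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * components


# Straight-line code: a single edge from entry to exit.
straight = [("entry", "exit")]

# One if/else: the condition branches to two blocks that join again.
if_else = [
    ("entry", "cond"),
    ("cond", "then"),
    ("cond", "else"),
    ("then", "join"),
    ("else", "join"),
    ("join", "exit"),
]

# One loop: the loop head either enters the body or leaves.
loop = [("entry", "head"), ("head", "body"), ("body", "head"), ("head", "exit")]

v_straight = cyclomatic_number(straight)
v_if_else = cyclomatic_number(if_else)
v_loop = cyclomatic_number(loop)
```

Each additional decision point raises the number by one, so a lower value for the same computation suggests simpler control flow.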

A Portable Runtime Interface For Multi-Level Memory Hierarchies

[Paper reading]
A runtime interface for moving data and computation through parallel machines with multi-level memory hierarchies.
Targets multi-core/SMP, Cell B.E., and distributed-memory clusters.

The Runtime Interface

adaptation of the Sequoia compiler

  • initialization/setup of the machine, including communication resources and resources at all levels where tasks can be executed
  • data transfers between memory levels using asynchronous bulk transfers between arrays
  • task execution at specified levels of the machine

  • Enhanced bulk transfers
  • Disk and memory are accessed through the same interface
  • Top API and Bottom API
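The three responsibilities listed above (setup, asynchronous bulk transfers between arrays, and task execution at a chosen level) can be sketched as follows. This is only an illustration under stated assumptions: `Runtime`, `Level`, `alloc`, `xfer`, and `call` are hypothetical names, a thread pool stands in for the communication resources, and the paper's actual API is not reproduced here.

```python
from concurrent.futures import ThreadPoolExecutor


class Level:
    """One memory level (e.g. disk, main memory) holding named arrays."""

    def __init__(self, name):
        self.name = name
        self.arrays = {}


class Runtime:
    def __init__(self, level_names):
        # Setup: create every level and a worker pool used for transfers and tasks.
        self.levels = {name: Level(name) for name in level_names}
        self.pool = ThreadPoolExecutor(max_workers=2)

    def alloc(self, level, array, size):
        self.levels[level].arrays[array] = [0] * size

    def xfer(self, src, src_array, dst, dst_array, offset, count, dst_offset=0):
        """Start an asynchronous bulk copy; returns a Future to wait on."""
        def copy():
            s = self.levels[src].arrays[src_array]
            d = self.levels[dst].arrays[dst_array]
            d[dst_offset:dst_offset + count] = s[offset:offset + count]
        return self.pool.submit(copy)

    def call(self, level, task, *array_names):
        """Run `task` on arrays that live at the given level."""
        lvl = self.levels[level]
        return self.pool.submit(task, *(lvl.arrays[a] for a in array_names))

    def shutdown(self):
        self.pool.shutdown()


rt = Runtime(["disk", "memory"])
rt.alloc("disk", "A", 8)
rt.levels["disk"].arrays["A"][:] = range(8)
rt.alloc("memory", "buf", 4)

# Bring the second half of A one level up, then compute on it there.
rt.xfer("disk", "A", "memory", "buf", offset=4, count=4).result()
total = rt.call("memory", sum, "buf").result()
buf = list(rt.levels["memory"].arrays["buf"])
rt.shutdown()
```

Because `xfer` takes level names rather than device-specific handles, the same call shape covers disk-to-memory and memory-to-memory copies, which mirrors the "same interface" point in the list above.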

Multiscalar Processors

[Paper reading]
Multiscalar processors use a new, aggressive implementation paradigm for extracting large quantities of ILP from ordinary high-level language programs.