17 Matching Annotations
  1. Sep 2017
    1. Singularity containers can be used to package entire scientific workflows, software and libraries, and even data.

      Very interesting: Singularity basically allows containers to run in HPC environments, so that code running in the container can take advantage of HPC capabilities like massive scale and message passing, while the software stack inside the container stays isolated and portable.

  2. Aug 2017
    1. If zone reclaim is switched on, the kernel still attempts to keep the reclaim pass as lightweight as possible. By default, reclaim will be restricted to unmapped page-cache pages. The frequency of reclaim passes can be further reduced by setting /proc/sys/vm/min_unmapped_ratio to the percentage of memory that must contain unmapped pages for the system to run a reclaim pass. The default is 1 percent.

      This is a percentage of the total pages in each zone. Zone reclaim will only occur if more than this percentage of pages are in a state that zone_reclaim_mode allows to be reclaimed.

      If zone_reclaim_mode has the value 4 OR'd, then the percentage is compared against all file-backed unmapped pages including swapcache pages and tmpfs files. Otherwise, only unmapped pages backed by normal files but not tmpfs files and similar are considered.

      Source
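
      To make the percentage concrete, a minimal C sketch (the 4 GiB zone size is just an assumed example) that reads the sysctl and computes the corresponding per-zone page threshold:

      ```c
      /* Minimal sketch: read vm.min_unmapped_ratio and show roughly what it
       * means for one zone. The 4 GiB zone size is an assumption; the real
       * per-zone accounting lives in the kernel's zone reclaim path. */
      #include <stdio.h>
      #include <unistd.h>

      int main(void)
      {
          FILE *f = fopen("/proc/sys/vm/min_unmapped_ratio", "r");
          if (!f) { perror("min_unmapped_ratio"); return 1; }

          int ratio = 0;
          if (fscanf(f, "%d", &ratio) != 1) { fclose(f); return 1; }
          fclose(f);

          long page_size = sysconf(_SC_PAGESIZE);            /* usually 4096 */
          long long zone_pages = (4LL << 30) / page_size;    /* assumed 4 GiB zone */
          long long threshold  = zone_pages * ratio / 100;

          printf("vm.min_unmapped_ratio = %d%%\n", ratio);
          printf("a 4 GiB zone needs > %lld unmapped file pages before a reclaim pass\n",
                 threshold);
          return 0;
      }
      ```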

    2. There is a knob in the kernel that determines how the situation is to be treated in /proc/sys/vm/zone_reclaim. A value of 0 means that no local reclaim should take place. A value of 1 tells the kernel that a reclaim pass should be run in order to avoid allocations from the other node. On boot-up a mode is chosen based on the largest NUMA distance in the system.

      This appears to be /proc/sys/vm/zone_reclaim_mode now.
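
      A minimal sketch to read and decode the current setting (bit meanings per the kernel's Documentation/sysctl/vm.txt):

      ```c
      /* Minimal sketch: read /proc/sys/vm/zone_reclaim_mode and decode its bits.
       * Bit meanings (from Documentation/sysctl/vm.txt):
       *   1 = zone reclaim on, 2 = may write out dirty pages, 4 = may swap pages. */
      #include <stdio.h>

      int main(void)
      {
          FILE *f = fopen("/proc/sys/vm/zone_reclaim_mode", "r");
          if (!f) { perror("zone_reclaim_mode"); return 1; }

          int mode = 0;
          if (fscanf(f, "%d", &mode) != 1) { fclose(f); return 1; }
          fclose(f);

          printf("zone_reclaim_mode = %d\n", mode);
          printf("  zone reclaim enabled:      %s\n", (mode & 1) ? "yes" : "no");
          printf("  may write out dirty pages: %s\n", (mode & 2) ? "yes" : "no");
          printf("  may swap (unmap) pages:    %s\n", (mode & 4) ? "yes" : "no");
          return 0;
      }
      ```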

    3. There has been some recent work in making the scheduler NUMA-aware to ensure that the pages of a process can be moved back to the local node, but that work is available only in Linux 3.8 and later, and is not considered mature.

      Stampede2 KNL nodes are already running kernel 3.10, so this is likely available.

    4. The active memory allocation policies for all memory segments of a process (and information that shows how much memory was actually allocated from which node) can be seen by determining the process id and then looking at the contents of /proc/<pid>/numa_maps.
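
      For a quick look, a minimal sketch that just dumps the calling process's own map from /proc/self/numa_maps (same format as /proc/<pid>/numa_maps):

      ```c
      /* Minimal sketch: print this process's own NUMA policy per mapping. */
      #include <stdio.h>

      int main(void)
      {
          FILE *f = fopen("/proc/self/numa_maps", "r");
          if (!f) { perror("numa_maps"); return 1; }

          char line[1024];
          while (fgets(line, sizeof line, f))
              fputs(line, stdout);   /* e.g. "7f2a... default anon=12 N0=12 ..." */

          fclose(f);
          return 0;
      }
      ```
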
    5. How memory is allocated under NUMA is determined by a memory policy. Policies can be specified for memory ranges in a process's address space, or for a process or the system as a whole. Policies for a process override the system policy, and policies for a specific memory range override a process's policy.
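
      A minimal sketch of that hierarchy using libnuma's numaif.h (build with -lnuma; the node numbers 0 and 1 are assumptions for a two-node box): set a process-wide policy with set_mempolicy(), then override it for one mmap'd range with mbind().

      ```c
      /* Minimal sketch: process-wide policy, overridden for one memory range. */
      #include <numaif.h>     /* set_mempolicy, mbind, MPOL_* (link with -lnuma) */
      #include <sys/mman.h>
      #include <stdio.h>

      int main(void)
      {
          unsigned long node0 = 1UL << 0, node1 = 1UL << 1;  /* assumed node ids */
          size_t len = 64UL << 20;                           /* 64 MiB range */

          /* Process policy: prefer allocations from node 0. */
          if (set_mempolicy(MPOL_PREFERRED, &node0, sizeof(node0) * 8) != 0)
              perror("set_mempolicy");

          void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
          if (buf == MAP_FAILED) { perror("mmap"); return 1; }

          /* Range policy overrides the process policy: bind this range to node 1. */
          if (mbind(buf, len, MPOL_BIND, &node1, sizeof(node1) * 8, 0) != 0)
              perror("mbind");

          return 0;
      }
      ```
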
    6. The main performance issues typically involve large structures that are accessed frequently by the threads of the application from all memory nodes and that often contain information that needs to be shared among all threads. These are best placed using interleaving so that the objects are distributed over all available nodes.
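
      With libnuma this is essentially one call; a hedged sketch (the 1 GiB size is just a placeholder, build with -lnuma):

      ```c
      /* Minimal sketch: place a large, frequently shared structure with
       * interleaved pages so accesses are spread across all NUMA nodes. */
      #include <numa.h>
      #include <stdio.h>

      int main(void)
      {
          if (numa_available() < 0) {
              fprintf(stderr, "NUMA is not available on this system\n");
              return 1;
          }

          size_t len = 1UL << 30;                       /* assumed 1 GiB shared table */
          double *table = numa_alloc_interleaved(len);  /* pages striped over all nodes */
          if (!table) {
              fprintf(stderr, "numa_alloc_interleaved failed\n");
              return 1;
          }

          /* ... threads on every node would read/write the shared table here ... */

          numa_free(table, len);
          return 0;
      }
      ```
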
    7. In general, small Unix tools and small applications work very well with this approach. Large applications that make use of a significant percentage of total system memory and of a majority of the processors on the system will often benefit from explicit tuning or software modifications that take advantage of NUMA.
    8. Modern processors have multiple memory ports, and the latency of access to memory varies depending even on the position of the core on the die relative to the controller. Future generations of processors will have increasing differences in performance as more cores on chip necessitate more sophisticated caching.
    9. A memory access from one socket to memory from another has additional latency overhead to accessing local memory—it requires the traversal of the memory interconnect first.
    1. Some people think that these system calls are a good way to improve the performance of a high-performance process on a system. A common use case I’ve seen in the real world is to try to call mlockall() on a program that’s supposed to be running with very high performance. The reasoning is that if the program is paged out to disk, that will reduce performance; therefore mlockall() will improve things. If you try to actually use mlockall() in this way you might run into some difficulties because most systems have a very low default ulimit on the number of pages a process can lock. With some twiddling of the default ulimits you can get this working, but perhaps it’s worth considering why the default ulimits are so low in the first place.
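
      A minimal sketch of the pattern described above; with the default RLIMIT_MEMLOCK (often only ~64 KiB) the mlockall() call will usually fail with ENOMEM or EPERM, which is exactly the author's point:

      ```c
      /* Minimal sketch: report RLIMIT_MEMLOCK, then try to lock everything. */
      #include <sys/mman.h>
      #include <sys/resource.h>
      #include <stdio.h>

      int main(void)
      {
          struct rlimit rl;
          if (getrlimit(RLIMIT_MEMLOCK, &rl) == 0)
              printf("RLIMIT_MEMLOCK soft=%llu hard=%llu bytes\n",
                     (unsigned long long)rl.rlim_cur,
                     (unsigned long long)rl.rlim_max);

          /* Lock everything mapped now and everything mapped in the future. */
          if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
              perror("mlockall");   /* usually ENOMEM or EPERM with default ulimits */
              return 1;
          }

          puts("all pages locked in RAM");
          return 0;
      }
      ```
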
  3. Jul 2017
    1. evolution from PCI 1.0 through PCI-Express 5.0

      While the evolution of PCIe speed is definitely of interest, especially as it keeps pace with network speeds, the total number of PCIe lanes is also a significant barrier to I/O for many systems, especially in HPDA.

      We can effectively double network throughput by dropping in another x16 NIC. This becomes less possible if there are not enough slots (or, perhaps more importantly, if the available PCIe lanes are oversubscribed), and it becomes even more of an issue, as the author points out, with the advent of NVMe; see the rough bandwidth arithmetic below.

      Intel has a vested interest in keeping the number of PCIe lanes at 40 with Xeon and holding back implementation of PCIe 4.0. They provide proprietary high speed I/O to their Xeon Phi coprocessor and Optane memory products. This doesn't allow GPUs, FPGAs and competing NV memory products to compete on equal footing.

      AMD is somewhat breaking the stalemate with Zen Naples offering 128 PCIe 3.0 lanes. Will have to see if OEMs build systems that expose all of that I/O.
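
      Rough back-of-the-envelope numbers (nominal PCIe 3.0 figures, not measurements):

      ```c
      /* Back-of-the-envelope: nominal PCIe 3.0 x16 bandwidth vs. a 100 Gb/s NIC. */
      #include <stdio.h>

      int main(void)
      {
          double gt_per_lane   = 8.0;             /* PCIe 3.0: 8 GT/s per lane */
          double encoding      = 128.0 / 130.0;   /* 128b/130b line-code overhead */
          double gbit_per_lane = gt_per_lane * encoding;       /* ~7.9 Gb/s */
          double x16_gbyte_s   = 16.0 * gbit_per_lane / 8.0;   /* ~15.8 GB/s */

          printf("PCIe 3.0 x16: ~%.1f GB/s (~%.0f Gb/s)\n",
                 x16_gbyte_s, 16.0 * gbit_per_lane);
          printf("a 100 Gb/s NIC needs ~12.5 GB/s, i.e. most of one x16 slot\n");
          return 0;
      }
      ```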

  4. Jun 2016
    1. Docker is a type of virtual machine

      How does it compare to installing the packages directly? Could be useful for development, but maybe not practical for HPC applications. Maybe just create a CD ISO with all the correct programs and their dependencies.

  5. May 2014
    1. Specifically, we explore three key usage modes (see Figure 1): • HPC in the Cloud, in which researchers outsource entire applications to current public and/or private Cloud platforms; • HPC plus Cloud, focused on exploring scenarios in which clouds can complement HPC/grid resources with cloud services to support science and engineering application workflows—for example, to support heterogeneous requirements or unexpected spikes in demand; and • HPC as a Service, focused on exposing HPC/grid resources using elastic on-demand cloud abstractions, aiming to combine the flexibility of cloud models with the performance of HPC systems

      Three key usage modes for HPC & Cloud:

      • HPC in the Cloud
      • HPC plus Cloud
      • HPC as a Service
  6. Apr 2014
    1. Over the last twenty years, the open source community has provided more and more software on which the world’s High Performance Computing (HPC) systems depend for performance and productivity. The community has invested millions of dollars and years of effort to build key components. But although the investments in these separate software elements have been tremendously valuable, a great deal of productivity has also been lost because of the lack of planning, coordination, and key integration of technologies necessary to make them work together smoothly and efficiently, both within individual PetaScale systems and between different systems. It seems clear that this completely uncoordinated development model will not provide the software needed to support the unprecedented parallelism required for peta/exascale computation on millions of cores, or the flexibility required to exploit new hardware models and features, such as transactional memory, speculative execution, and GPUs. This report describes the work of the community to prepare for the challenges of exascale computing, ultimately combining their efforts in a coordinated International Exascale Software Project.