Agner's CPU blog


 
thread Function libraries and tools updated - Agner - 2013-10-07
reply asmlib virus hit - mathew boorman - 2013-11-03
replythread Function libraries and tools updated - Yury Hushchyn - 2014-10-17
last reply Function libraries and tools updated - Agner - 2014-10-18
replythread Function libraries and tools updated - Andysem - 2015-03-05
last reply Function libraries and tools updated - Agner - 2015-03-05
last replythread Function libraries and tools updated - Travis Downs - 2019-05-03
last reply Function libraries and tools updated - Agner - 2019-05-04
 
Function libraries and tools updated
Author: Agner Date: 2013-10-07 10:02

My vector class library, asmlib library, random number generator library, objconv tool, and test programs have now been updated as explained below.

Vector class library

The vector class library is a collection of C++ classes, functions and operators that makes it easier to use the vector instructions of modern CPUs without using assembly language. Main improvements in the latest version:

  • Support for Clang compiler
  • Clear distinction between boolean vectors and integer vectors for the sake of compatibility with future AVX512 instruction set
  • Added function if_add
  • Minor bug fixes
  • Workarounds for various problems in specific compilers

Link to vector class library.
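
To illustrate what a conditional add such as if_add computes, and why the mask is kept as a separate boolean vector type, here is a minimal scalar sketch. It does not use the vector class library itself: the names Int4, Bool4 and if_add_scalar are hypothetical stand-ins, and the loop only emulates the per-element behaviour result[i] = mask[i] ? a[i] + b[i] : a[i].

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar stand-ins for a 4-element integer vector and its boolean mask.
// Keeping the mask as its own type mirrors the boolean/integer vector distinction.
using Int4  = std::array<std::int32_t, 4>;
using Bool4 = std::array<bool, 4>;

// Per-element conditional add: result[i] = mask[i] ? a[i] + b[i] : a[i].
Int4 if_add_scalar(const Bool4& mask, const Int4& a, const Int4& b) {
    Int4 result{};
    for (std::size_t i = 0; i < result.size(); ++i) {
        result[i] = mask[i] ? a[i] + b[i] : a[i];
    }
    return result;
}

// Usage: add b only in the elements where the mask is true.
void example_if_add() {
    Bool4 mask = {true, false, true, false};
    Int4 a = {1, 2, 3, 4};
    Int4 b = {10, 20, 30, 40};
    Int4 r = if_add_scalar(mask, a, b);
    assert(r[0] == 11 && r[1] == 2 && r[2] == 33 && r[3] == 4);
}
```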

Asmlib library

This is a library of optimized subroutines coded in assembly language. Supports many different compilers and operating systems. This library contains faster versions of common C/C++ memory and string functions, fast functions for string search and string parsing, fast integer division and integer vector division, random number generators, and several other useful functions not found elsewhere. Main improvements in latest version:

  • New function: memcmp
  • memcpy, memmove and memset functions updated with optimizations for newest processors, including Intel Haswell and AMD Piledriver.
  • Various random number generators included. These were previously in a separate library.

Link to asmlib library.

Random number generator library

This is a collection of pseudo random number generators for demanding scientific applications including various continuous and discrete distributions. Includes C++ class libraries and binary library files. A non-deterministic random number generator function is added for use with microprocessors that have a built-in physical random number generator. The binary library files have now been integrated with the Asmlib library in order to explore the synergy between the two libraries and make maintenance easier.

Link to random number generator library.

Objconv tool

This utility can be used for converting object files between different object file formats for all 32-bit and 64-bit x86 platforms. Can modify symbol names in object files. Can build, modify and convert function libraries across platforms. Can dump object files and executable files. Also includes a very good disassembler supporting the latest instruction sets. Main improvements in latest version:

  • Faster handling of large libraries
  • Better handling of library members with long names
  • Can change prefixes and suffixes of function names in object files and library files
  • Disassembler supports future instruction sets, including AVX-512
  • Various bug fixes

Link to objconv tool.

Test programs

This is a collection of test programs that I have used for my research. Can measure clock cycles and performance monitor counters such as cache misses, branch mispredictions, resource stalls etc. for small pieces of code. This has been updated to support the latest microprocessors, including Intel Haswell and AMD Piledriver. A lot of test scripts have been added for automated tests of instruction latencies and throughputs, instruction fetch and decode rates, data cache, microop cache, store forwarding, and many other details.

Link to test programs.

   
asmlib virus hit
Author: mathew boorman Date: 2013-11-03 17:24
Hi, just found your work. Awesome stuff!

Just FYI, when I downloaded asmlib.zip, VirusTotal reported one of the 30ish AV's hit against it.

I'll tag it as a false hit at virustotal.com, but you might want to push back to the AntiVir vendor if possible.
- AntiVir (7.11.110.204, 20131103): HEUR/ELF.Malformed

   
Function libraries and tools updated
Author: Yury Hushchyn Date: 2014-10-17 17:23
Was using your great collection of test programs for microbenchmarking, really useful. Especially pmctest and timingtest.

At first I was getting random negative PMC readings, but then I figured out that my code fragment sometimes migrates to another core. The PMC readings then become inconsistent, causing jumpy and misleading results. Adding a note on affinity setting and cross-core thread migration to "7.23 Why do I get negative counts?" may be helpful for people with similar issues.

Yury

   
Function libraries and tools updated
Author: Agner Date: 2014-10-18 01:58
Yury Hushchyn wrote:
sometimes my code fragment migrates to another core and PMC readings get inconsistent
The PMCTest will fix the code to a specific CPU core, but with the timingtest you are on your own.
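
For code that has to handle this itself, a minimal sketch of pinning the calling thread to one core might look like the following. It assumes Linux with glibc (sched_getaffinity and sched_setaffinity); the function name pin_to_first_allowed_cpu is hypothetical, and this is not how PMCTest does it.

```cpp
#include <cassert>
#include <sched.h>   // sched_getaffinity, sched_setaffinity, cpu_set_t (Linux/glibc)

// Pin the calling thread to the first CPU it is currently allowed to run on.
// Returns that CPU number on success, or -1 on failure.
int pin_to_first_allowed_cpu() {
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0) return -1;

    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &allowed)) {
            cpu_set_t one;
            CPU_ZERO(&one);
            CPU_SET(cpu, &one);
            if (sched_setaffinity(0, sizeof(one), &one) != 0) return -1;
            return cpu;
        }
    }
    return -1;  // no allowed CPU found
}

// Usage: call once in the measuring thread before reading any counters.
void example_pin_before_measuring() {
    int cpu = pin_to_first_allowed_cpu();
    assert(cpu >= 0 && "could not set CPU affinity");
    // ... read counters, run the code under test, read counters again ...
}
```

Starting from the currently allowed set, rather than hard-coding core 0, keeps the sketch usable in environments where some cores are not available to the process.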
   
Function libraries and tools updated
Author: Andysem Date: 2015-03-05 03:03
Hi, Agner.

I was looking at your asmlib code, memcpy in particular, and was wondering why ERMSB (the improved "rep movsb" in Ivy Bridge and later) is not used. Does it not offer any performance improvement over the current version, or did you not have a chance to add support for it?

I was also wondering if unrolling the main loop in memcpy with AVX2 could improve performance a little.

Thanks.

   
Function libraries and tools updated
Author: Agner Date: 2015-03-05 11:37
Andysem wrote:
wondering why ERMSB (the improved "rep movsb" in Ivy Bridge and later) is not used?
I have tested it and the AVX2 implementation is faster in most cases.
Unrolling loops takes extra space in the code cache and rarely gives any significant improvement in speed.
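
For readers who want to run such a comparison themselves, a "rep movsb" copy can be sketched as below. This is not the asmlib code. It assumes GCC or Clang inline assembly on x86-64, the name copy_rep_movsb is hypothetical, and on other targets the sketch simply falls back to std::memcpy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copy n bytes with "rep movsb" on x86-64 (GCC/Clang inline asm).
// On any other target, fall back to std::memcpy.
void* copy_rep_movsb(void* dst, const void* src, std::size_t n) {
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    void* d = dst;
    // rep movsb copies RCX bytes from [RSI] to [RDI] and updates all three registers.
    asm volatile("rep movsb"
                 : "+D"(d), "+S"(src), "+c"(n)
                 :
                 : "memory");
    return dst;
#else
    return std::memcpy(dst, src, n);
#endif
}

// Usage: copy a small buffer and check the result.
void example_copy() {
    const char src[] = "ERMSB";
    char dst[sizeof(src)] = {};
    copy_rep_movsb(dst, src, sizeof(src));
    assert(std::memcmp(dst, src, sizeof(src)) == 0);
}
```

Whether this beats a vectorized loop depends on the processor and the block size, so it needs to be measured on the target machine.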
   
Function libraries and tools updated
Author: Travis Downs Date: 2019-05-03 22:07
Can the objconv tool be told to output only a specific function?

I understand that function boundary determination is not exact, but the comments in the disassembly say "xxx End of function", so objconv obviously has at least taken a guess as to where the function ends.

It would be nice to leverage that into just seeing the output for a specific function.

   
Function libraries and tools updated
Author: Agner Date: 2019-05-04 00:04
Travis Downs wrote:
Can the objconv tool be told to output only a specific function?
No. You can just search for the function name in the disassembly output.
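
Such a search can also be scripted. The sketch below is a hypothetical post-processing filter, not an objconv feature: it copies lines from the first one containing a caller-supplied start marker through the next line containing "End of function", relying only on the comment text quoted above. The function name extract_function and the marker format are assumptions. The right marker (for example the label line of the function) depends on the output syntax chosen, and a plain substring match may hit an earlier reference to the same name.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Return the lines from the first one containing start_marker through the next
// line containing "End of function" (inclusive). Returns an empty string if the
// marker is not found; runs to the end of the listing if no end comment follows.
std::string extract_function(const std::string& listing, const std::string& start_marker) {
    std::istringstream in(listing);
    std::string line, out;
    bool inside = false;
    while (std::getline(in, line)) {
        if (!inside && line.find(start_marker) != std::string::npos) inside = true;
        if (inside) {
            out += line;
            out += '\n';
            if (line.find("End of function") != std::string::npos) break;
        }
    }
    return out;
}

// Usage with a made-up listing (not real objconv output).
void example_extract() {
    const std::string listing =
        "bar:\n  ret\n; bar End of function\n"
        "foo:\n  mov eax, 1\n  ret\n; foo End of function\n";
    assert(extract_function(listing, "foo:") ==
           "foo:\n  mov eax, 1\n  ret\n; foo End of function\n");
}
```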