\begin{itemize}
\item Positional popcount counts, for each bit-position, how many values in an array of inputs have that bit set to 1.
\item Notoriously difficult to do in SIMD assembler: typically 550 lines
- \item https://github.com/clausecker/pospop
+ \item https://github.com/clausecker/pospop - Full writeup: \\
+ https://libre-soc.org/openpower/sv/cookbook/pospopcnt
\end{itemize}
}
\frame{\frametitle{maxloc}
+ \lstinputlisting[language={}]{maxloc.py}
+
\begin{itemize}
- \item "TODO
+ \item FORTRAN MAXLOC: find the index of the largest number.
+ Notoriously difficult to implement optimally for SIMD
+ \item algorithms include \textit{depth-first} recursive
+ descent (!) mapreduce-style: each level computes its local
+ largest value and index, offsets the index, and passes
+ both up to be tested in the upper level(s)
+ \item SVP64: note below the sv.cmp (first while-loop), the
+ sv.minmax. (second while-loop) and the sv.crnand, which
+ through Predicate masking becomes a 3-in 1-out CR op,
+ not the usual 2-in 1-out
+ \item There is, however, quite a bit of ``housekeeping''.
+ Full analysis: \\
+ https://libre-soc.org/openpower/sv/cookbook/fortran\_maxloc
\end{itemize}
}
+\frame{\frametitle{maxloc assembler}
+
+ \lstinputlisting[language={}]{maxloc.s}
+}
+
\frame{\frametitle{Summary}
\begin{itemize}
\frame{
\begin{center}
- {\Huge The end\vspace{12pt}\\
+ {\Huge The end\vspace{1pt}\\
Thank you\vspace{12pt}
}
\end{center}
\begin{itemize}
+ \item Video of this talk: https://youtu.be/fxClvuc2-f8
\item Discussion: http://lists.libre-soc.org
\item OFTC.net IRC \#libre-soc
\item http://libre-soc.org/
\item https://nlnet.nl/project/Libre-SOC-OpenPOWER-ISA
\item https://bugs.libre-soc.org/show\_bug.cgi?id=676
+ \item https://bugs.libre-soc.org/show\_bug.cgi?id=1244
\item https://libre-soc.org/openpower/sv/cookbook/fortran\_maxloc
\item https://libre-soc.org/nlnet/\#faq
\end{itemize}