
Return-path: <libre-riscv-dev-bounces@lists.libre-riscv.org>
Envelope-to: publicinbox@libre-riscv.org
Delivery-date: Sat, 28 Mar 2020 14:08:34 +0000
Received: from localhost ([::1] helo=libre-riscv.org)
        by libre-riscv.org with esmtp (Exim 4.89)
        (envelope-from <libre-riscv-dev-bounces@lists.libre-riscv.org>)
        id 1jIC8W-0004aH-R7; Sat, 28 Mar 2020 14:08:32 +0000
Received: from vps2.stafverhaegen.be ([85.10.201.15])
 by libre-riscv.org with esmtp (Exim 4.89)
 (envelope-from <staf@fibraservi.eu>) id 1jIC8U-0004aB-Gu
 for libre-riscv-dev@lists.libre-riscv.org; Sat, 28 Mar 2020 14:08:30 +0000
Received: from hpdc7800 (hpdc7800 [10.0.0.1])
 by vps2.stafverhaegen.be (Postfix) with ESMTP id CAAF811C040B
 for <libre-riscv-dev@lists.libre-riscv.org>;
 Sat, 28 Mar 2020 15:08:29 +0100 (CET)
Message-ID: <0d35e45bd81eeaecedeb64dc5061c1e33c89630c.camel@fibraservi.eu>
From: Staf Verhaegen <staf@fibraservi.eu>
To: libre-riscv-dev@lists.libre-riscv.org
Date: Sat, 28 Mar 2020 15:08:25 +0100
In-Reply-To: <CAPweEDzAtWoU+wc6MTayF1vtKJvrxLLfP-Q1Czea+NX5MgOrfg@mail.gmail.com>
References: <CAPweEDx5QCCKxSr1gfuyuw_2D68Ld8fK85bEmmMTZi8S3w2E9g@mail.gmail.com>
 <29b1a9ecedda151dc9c8da6516c3691dfede62ef.camel@fibraservi.eu>
 <CAPweEDwfqMczPjg=5Fvt1J_S8nx1YK44XhyBY8H1abuTNF6=xg@mail.gmail.com>
 <6fa40cb78b3f8c013ca4953ccb4daa5c23e3b501.camel@fibraservi.eu>
 <CAPweEDxiyTEsneXN65Kq0HsEsdL3wdY=NYayq2tz5egXJNCVfg@mail.gmail.com>
 <e430ea6587d292166fd58460adf4dfebfad20c6d.camel@fibraservi.eu>
 <CAPweEDzEvtPYGKvGMvebmQzhJDhSgfvUOVZvB2WXxSbv_ebE8A@mail.gmail.com>
 <b18283c7e7a93fa8afdef2f0a8679b26e4569528.camel@fibraservi.eu>
 <CAPweEDwznLD5o6rHfWsSXR-8e1hbAfAB04f5O+YkL6pCwGsNfQ@mail.gmail.com>
 <6fbfb2a3258be77f4fce69661b283dc31a683f7b.camel@fibraservi.eu>
 <CAPweEDwf7s=r6bhq6N=VG7QQ1iD4jHYEG6mGvtxL32Uxnhzqwg@mail.gmail.com>
 <9e44930a0332eff507661e617796b9d0674b0e05.camel@fibraservi.eu>
 <CAPweEDzAtWoU+wc6MTayF1vtKJvrxLLfP-Q1Czea+NX5MgOrfg@mail.gmail.com>
Organization: FibraServi bvba
X-Mailer: Evolution 3.28.5 (3.28.5-5.el7)
Mime-Version: 1.0
X-Content-Filtered-By: Mailman/MimeDel 2.1.23
Subject: [libre-riscv-dev] Clock Gating (was  cache SRAM organisation)
X-BeenThere: libre-riscv-dev@lists.libre-riscv.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Libre-RISCV General Development
 <libre-riscv-dev.lists.libre-riscv.org>
List-Unsubscribe: <http://lists.libre-riscv.org/mailman/options/libre-riscv-dev>,
 <mailto:libre-riscv-dev-request@lists.libre-riscv.org?subject=unsubscribe>
List-Archive: <http://lists.libre-riscv.org/pipermail/libre-riscv-dev/>
List-Post: <mailto:libre-riscv-dev@lists.libre-riscv.org>
List-Help: <mailto:libre-riscv-dev-request@lists.libre-riscv.org?subject=help>
List-Subscribe: <http://lists.libre-riscv.org/mailman/listinfo/libre-riscv-dev>,
 <mailto:libre-riscv-dev-request@lists.libre-riscv.org?subject=subscribe>
Reply-To: Libre-RISCV General Development
 <libre-riscv-dev@lists.libre-riscv.org>
Content-Type: multipart/mixed; boundary="===============5623462699142308802=="
Errors-To: libre-riscv-dev-bounces@lists.libre-riscv.org
Sender: "libre-riscv-dev" <libre-riscv-dev-bounces@lists.libre-riscv.org>


--===============5623462699142308802==
Content-Type: multipart/signed; micalg="pgp-sha1"; protocol="application/pgp-signature";
        boundary="=-FBhRFmonkZ2xasih2XbI"


--=-FBhRFmonkZ2xasih2XbI
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
  66
Luke Kenneth Casson Leighton wrote on Fri 27-03-2020 at 10:59 [+0000]:
> On Fri, Mar 27, 2020 at 10:36 AM Staf Verhaegen <staf@fibraservi.eu> wrote:
> > Yes and no, it is the basic functionality of a pipeline :(
>
> yes.
> > You have the same latency but can have double the number of operations in flight.
>
> yes.  hence why it is so important to have, because double the number of operations means that we need double the number of Function Units in the Dependency Matrix in order to keep the entire out-of-order engine occupied.
> also, double the number of operations in flight means that we need double the number of Branch Prediction Units, and much more complex BPUs at that, just to deal with the (now very likely) scenario of having far more overlapping inner loops "in flight".
> all this from just extending the pipeline length(s) from 5 to 10.  so it's not just a "nice-to-have" feature, it's actually really important to keeping the overall size of the chip down.
  86
There is an (IMO better) alternative to what you are doing with your pass-through registers, and that is clock gating (wikipedia, allaboutcircuits).
The principle is that you save power by not clocking the parts of the circuit that don't have to do any computing. I think this could be a more general way to enable only the stages in your pipeline that are actually doing computation.
In the above example you would always use a 10-stage pipeline running at 1600MHz, but to mimic the 5-stage pipeline you submit an operation only every other clock cycle and alternately enable the odd and even stages in your pipeline. This way the MUXes are removed from the computation path.
Using a shift register, this can easily be generalized to enable only the stages through which an operation is currently passing. When an operation is submitted you set the first bit in the shift register to enable the first stage in the pipeline. With each cycle you then shift this bit along, so that the stage needed for the execution of that operation is the one that is active.
This is a general power optimization, because it means that if you are running a program that only uses integer operations, your FPU and GPU will use almost no power.
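To make the shift-register idea concrete, here is a small plain-Python model of the enable bits only (no datapath, no nmigen). The function names `shift_enables` and `enable_trace` are made up for this illustration, and the model assumes that the enable bit for the first stage becomes visible one cycle after the operation is submitted. It is a sketch of the principle, not of any real implementation.

```python
def shift_enables(enables, newop):
    """Return next-cycle enable bits: newop enters at stage 0, the rest shift along."""
    return [int(bool(newop))] + enables[:-1]


def enable_trace(newops, n_stages=10):
    """Return the list of enable-bit vectors seen after each clock edge."""
    enables = [0] * n_stages
    trace = []
    for newop in newops:
        enables = shift_enables(enables, newop)
        trace.append(enables)
    return trace


# Submit an operation every other cycle, which mimics the 5-stage behaviour
# on a 10-stage pipeline.
trace = enable_trace([1, 0] * 10)
for cycle, enables in enumerate(trace):
    print(cycle, "".join(str(bit) for bit in enables))
```

Once the pipeline has filled, the printed rows alternate between the even-numbered and the odd-numbered stages being enabled, so only half of the stages are clocked in any given cycle.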
 105
The way to implement it is with nmigen's EnableInserter. Some untested code showing how I think it can be done:
 108
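        from nmigen import Cat, EnableInserter, Signal

        # One enable bit per pipeline stage; bit 0 gates the first stage.
        stages_en = Signal(10)
        stage1 = EnableInserter(stages_en[0])(Stage1())
        stage2 = EnableInserter(stages_en[1])(Stage2())
        ...

        # Shift the enable bits along every cycle; newop marks a newly submitted operation.
        m.d.sync += stages_en.eq(Cat(newop, stages_en[0:9]))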
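As a rough sanity check of the idea in the snippet above, the following plain-Python behavioural model (not nmigen, not synthesizable) adds a datapath to the enable shift register. Everything in it is an assumption made for illustration: the class name `GatedPipeline`, the placeholder stage logic (each stage just adds one), and an input register `pending` that holds the operand until the first stage is enabled one cycle after `newop`.

```python
class GatedPipeline:
    """Behavioural model: n_stages registers, each updated only when enabled."""

    def __init__(self, n_stages=10):
        self.n = n_stages
        self.enables = [0] * n_stages  # plays the role of stages_en
        self.regs = [0] * n_stages     # one output register per stage
        self.pending = 0               # operand latched when newop is set
        self.clocked = 0               # number of register updates (activity proxy)

    def tick(self, newop=0, data=0):
        """Advance one clock edge; every next-state value is computed from old state."""
        new_regs = list(self.regs)
        for i in range(self.n):
            if self.enables[i]:
                src = self.pending if i == 0 else self.regs[i - 1]
                new_regs[i] = src + 1  # placeholder stage logic: add one
                self.clocked += 1
        if newop:
            self.pending = data
        self.regs = new_regs
        self.enables = [int(bool(newop))] + self.enables[:-1]

    @property
    def result(self):
        """Value held in the last stage register."""
        return self.regs[-1]


pipe = GatedPipeline()
pipe.tick(newop=1, data=100)   # submit one operation
for _ in range(10):            # let it travel through all 10 stages
    pipe.tick()
print(pipe.result, pipe.clocked)
```

In this model a single operation causes exactly one register update per stage, whereas an ungated 10-stage pipeline would update all 10 registers on every one of those cycles.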
 115
That said, I think this feature does not fit in the MVP scope of the October prototype, so that chip should IMO use neither clock gating nor the pass-through register feature from the original discussion. The reason is that implementing it is easier said than done. Several things need to be done:
- You first need a clock gating cell. This is not available in nsxlib and is currently not planned to be implemented. I don't want to commit to something extra for the May test chip tape-out either.
- nmigen/yosys needs to properly support clock gating for ASICs. Likely this means work in yosys to insert the clock gates from the if clauses in the RTL.
- Your P&R tool (e.g. Coriolis) needs to support the clock gates. It means your clock tree synthesis (CTS) needs to support more than just buffers in the clock tree. This is not a simple task and has to be discussed with Jean-Paul & co.
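On the first item in the list above: the reason a dedicated clock gating cell is normally used, rather than a plain AND gate on the clock, can be shown with a tiny sampled-waveform model in plain Python. This assumes the common latch-plus-AND style of gating cell; the mail does not say which cell design would be used, and the function names `naive_gate` and `latch_gate` are invented for this sketch.

```python
def naive_gate(clk, en):
    """AND the clock with the raw enable (both are lists of 0/1 samples)."""
    return [c & e for c, e in zip(clk, en)]


def latch_gate(clk, en):
    """Latch-based gate: enable is followed while clk is low, held while clk is high."""
    latched = 0
    out = []
    for c, e in zip(clk, en):
        if c == 0:
            latched = e  # latch is transparent during the low phase
        out.append(c & latched)
    return out


# Four samples per clock period: two low, then two high.
clk = [0, 0, 1, 1] * 3
# Enable rises in the middle of the first high phase and falls in the
# middle of the third high phase.
en = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0]

print("naive:", naive_gate(clk, en))
print("latch:", latch_gate(clk, en))
```

The naive version produces shortened clock pulses whenever the enable changes while the clock is high, while the latch-based version only ever passes complete pulses. Shortened pulses on a clock net are what the gating cell is there to prevent.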
 130
Greetings,
Staf.

--=-FBhRFmonkZ2xasih2XbI--
 135
 136
 137
--===============5623462699142308802==--