ModelSim & Multithreading

Postby Dek » Fri, 08 May 2009 19:39:57 GMT

Hi all,

do you know anything about multithreading with the ModelSim PE student
edition? I'm simulating a big testbench and it takes a lot of time, but
I see that vsimk.exe uses just 50% of the CPU; since I have a dual-core
processor, I think ModelSim is using just one core for simulations. The
question is: is there a setting or something to use both cores?


Thanks


Bye

Re: ModelSim & Multithreading

Postby Kim Enkovaara » Fri, 08 May 2009 20:01:36 GMT



There is no such setting, not even in the SE version or Questa. If you
are heavily dumping waveforms etc., the waveform writing can be
threaded, and at least in the expensive versions it is threaded by
default.

--Kim

Re: ModelSim & Multithreading

Postby Marcus Harnisch » Fri, 08 May 2009 20:21:42 GMT

Dek < XXXX@XXXXX.COM > writes:


Currently EDA tool makers seem to agree that the overhead of
resynchronizing many individual simulation threads back to the HDL
timing model is simply too big to make it worthwhile.

But I am sure you have more than just one test... Just run simulations
concurrently.
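If the tool supports command-line runs, a simple driver script can keep all cores busy. Here is a minimal sketch in Python; the testbench names and the `vsim -c` command line are placeholders, and the `dry_run` flag stands in for an actual ModelSim installation:

```python
# Sketch: run independent testbench simulations concurrently, one
# simulator process per core. Testbench names and the vsim command
# line are placeholders -- substitute your own.
import subprocess
from concurrent.futures import ThreadPoolExecutor

TESTS = ["tb_uart", "tb_fifo", "tb_dma"]  # hypothetical testbench names

def run_sim(tb, dry_run=True):
    cmd = ["vsim", "-c", "-do", "run -all; quit -f", f"work.{tb}"]
    if dry_run:  # placeholder so the sketch runs without ModelSim installed
        return (tb, 0)
    return (tb, subprocess.run(cmd).returncode)

# Each worker thread just blocks on its own simulator process, so the
# simulator processes themselves run concurrently on separate cores.
with ThreadPoolExecutor(max_workers=len(TESTS)) as pool:
    results = dict(pool.map(run_sim, TESTS))

print(results)  # {'tb_uart': 0, 'tb_fifo': 0, 'tb_dma': 0}
```

Each test gets its own simulator process and the OS scheduler spreads them across cores, so total wall-clock time approaches that of the slowest single test.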

Regards
Marcus

-- 
note that "property" can also be used as syntaxtic sugar to reference
a property, breaking the clean design of verilog; [...]

             (seen on  http://www.**--****.com/ )

Re: ModelSim & Multithreading

Postby Petter Gustad » Sat, 09 May 2009 18:37:37 GMT

Marcus Harnisch < XXXX@XXXXX.COM > writes:


Synopsys VCS claims 2X performance increase on multicore CPU's:

"VCSmulticore technology delivers a 2x verification speed-up that
helps users find design bugs early in the product development cycle.
VCS multicore technology cuts down verification time by running the
design, testbench, assertions, coverage and debug in parallel on
machines with multiple cores."

 http://www.**--****.com/ 


If you can partition the design so that you minimize the
latency-sensitive communication (this might require dynamic
partitioning and process migration) you should be able to achieve a
decent speed increase.

An FPGA is a parallel processor where each LE is a small processing
element. However, the latency between LEs is extremely short, which is
not the case for shared memory, or even a network-attached bus. Hence
the challenge is to hide the latency. This is a very complex problem,
and one that is very difficult to bolt onto an existing simulator. It
would probably be easier to design from scratch, but it is still very
complex. Hopefully we will see more parallel simulators in the future.


Petter

-- 
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

Re: ModelSim & Multithreading

Postby Marcus Harnisch » Tue, 12 May 2009 22:06:44 GMT

Petter Gustad < XXXX@XXXXX.COM > writes:


But those are not really simulation threads. Other vendors claim
similar things (waveform dump in a separate thread).

>> If you can partition the design so that you minimize the
>> latency-sensitive communication (this might require dynamic
>> partitioning and process migration) you should be able to achieve
>> a decent speed increase.

I guess the "If" is a significant issue. The analysis might be
costly. But there is more than just this optimization task, which is
difficult enough, I gather. Another requirement in simulation is the
capability to rerun a test *exactly* the same way it was executed
before. Having the simulation run in different threads in an
inherently non-deterministic environment (OS, other processes) while
putting those threads into a deterministic execution sequence almost
contradicts itself. I am sure EDA vendors are racking their brains for
a solution to this.

It is much less of an effort to run several simulations in
parallel. You can do that today.

Kind regards
Marcus

-- 
note that "property" can also be used as syntaxtic sugar to reference
a property, breaking the clean design of verilog; [...]

             (seen on  http://www.**--****.com/ )

Re: ModelSim & Multithreading

Postby Petter Gustad » Wed, 13 May 2009 00:14:56 GMT

Marcus Harnisch < XXXX@XXXXX.COM > writes:


Synopsys talks about both Application Level Parallelism (ALP) and
Design Level Parallelism (DLP). The latter consists of simulation
threads; the former might not. However, I haven't used this version of
VCS, so I can't verify what Synopsys is saying in its FAQs, press
releases, etc.

>>> If you can partition the design so that you minimize the
>>> latency-sensitive communication (this might require dynamic
>>> partitioning and process migration) you should be able to achieve
>>> a decent speed increase.
>>
>> I guess the "If" is a significant issue. The analysis might be

Yes. It's easy to imagine a design consisting of two small modules
where the output of one is fed into the other and vice versa. Both
depend upon the other's output, so the latency would hurt performance
and you would probably not split them across two cores/processors.

However, for a design where a testbench is generating stimuli and the
data are all inputs to the DUT, it would be feasible to split the two
across multiple processors, depending upon the bandwidth of the data
going from the stimulus generator to the DUT.

The analysis is costly, and it might be difficult to do at compile
time in many cases; e.g., the toggling frequency of some input might
be a function of external data.

>> difficult enough I gather. Another requirement in simulation is the
>> capability to rerun a test *exactly* the same way it was executed
>> before. Having the simulation run in different threads in an
>> inherently non-deterministic environment (OS, other processes) and
>> putting these threads into a deterministic execution sequence almost
>> contradicts itself. I am sure EDA vendors are racking their heads for
>> a solution to this.

I can't see why it's so difficult to keep track of thread statistics
and synchronization points (this should probably be a simulator
option) so you can re-run the simulation on the same processors,
etc. The replay might run slower, though, since the loads on the other
processors might differ from the previous runs.
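The idea can even be sketched at toy scale (my own illustration in Python, not a feature of any simulator): record the order in which threads reach a synchronization point, then force the same order on a later run, accepting that the forced run may be slower.

```python
# Toy record/replay: log the order threads reach a sync point, then
# replay a run forcing that exact order, for reproducibility.
import threading

def run(names, forced_order=None):
    order = []
    lock = threading.Lock()
    cv = threading.Condition()
    turn = {"idx": 0}  # index of the next name allowed through (replay mode)

    def worker(name):
        if forced_order is None:
            with lock:  # record mode: log whatever order the OS produced
                order.append(name)
        else:
            with cv:  # replay mode: wait until it is this thread's turn
                cv.wait_for(lambda: forced_order[turn["idx"]] == name)
                order.append(name)
                turn["idx"] += 1
                cv.notify_all()

    threads = [threading.Thread(target=worker, args=(n,)) for n in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return order

recorded = run(["a", "b", "c"])            # order decided by the scheduler
replayed = run(["a", "b", "c"], recorded)  # forced to match the recording
print(replayed == recorded)  # True
```

The replay run serializes the threads, which is exactly the performance penalty described above.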


Petter

-- 
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

Re: ModelSim & Multithreading

Postby Kim Enkovaara » Wed, 13 May 2009 14:04:58 GMT



In theory the language should protect against this; the event rules
are defined in the LRMs. And I think VHDL will be easier in this
context because it is harder to create race hazards in VHDL.

But for Verilog this will be a real problem. Many commercial
behavioral Verilog models assume things that are not guaranteed by the
LRM (order of process execution etc.). Quite often the "-keep_delta
-compat" flags are needed in ModelSim to get Verilog models to work.

But I would guess that parallel execution will be done at least at the
design-unit level. Most problematic language uses are usually
contained inside design units, and the perimeter is pure synchronous
logic.

--Kim

Re: ModelSim & Multithreading

Postby Marcus Harnisch » Wed, 13 May 2009 21:39:10 GMT

Kim Enkovaara < XXXX@XXXXX.COM > writes:


Likely. But honestly, who cares.


That is the point. EDA tools are written with Verilog in mind. A
feature has got to work with Verilog and, worse because of the
additional timing phases, with SystemVerilog. Everything else is
secondary to EDA vendors.

Nobody would implement such a feature for VHDL, saying that Verilog
support was planned for a future release.

Regards
Marcus

-- 
note that "property" can also be used as syntaxtic sugar to reference
a property, breaking the clean design of verilog; [...]

             (seen on  http://www.**--****.com/ )

Similar Threads:

1.Queues and multithreading

Yes, you will. The queue is itself the problem, and the reference
variable is just a layer away from the same problem. A global critical
section will work and is not overwhelmingly difficult; it just takes a
lot of hand-written code to encapsulate the logic you need to access
the queue.

On 3 May 2010 06:37:32 -0400, "Vadim Berman"
<vadim.bermanATATATdigitalsonata.com> wrote:

>I guess there's no workaround, I'll have to control the access to the queue 
>with the multithreading torture tools.
---------------------------------------
 Paul Blais -  Hayes, Virginia
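The "global critical section" mentioned above can be as little as one lock that every queue operation must take; a hypothetical sketch in Python (the class and method names are mine):

```python
# Hypothetical sketch: wrap every queue access in one global lock so
# only one thread can touch the queue at a time.
import threading
from collections import deque

class LockedQueue:
    def __init__(self):
        self._items = deque()
        self._lock = threading.Lock()  # the single global critical section

    def put(self, item):
        with self._lock:
            self._items.append(item)

    def get(self):
        # returns None when empty; no other thread can interleave here
        with self._lock:
            return self._items.popleft() if self._items else None

q = LockedQueue()
q.put(1)
q.put(2)
print(q.get(), q.get(), q.get())  # 1 2 None
```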

2.intro, intermediate concurrent programming / multithreading

I know everybody in group has had to go through this transition at one
point, becoming more comfortable with concurrent programming.

I've seen the basics, and am a bit familiar with the facilities in
mzscheme. I'm currently using SISC for a project. I've read SICP in
the past and its discussions of mutexes and state.

just looking for one or two high quality recommendations for textbooks
or technical books on the subject. more from a perspective of good
techniques and general ideas. doesn't have to use scheme, but
preferred. need it to cover multiple approaches, can't only cover one
approach.

I checked some FAQs and searched this group already; surprisingly, I
didn't find anything.

my best guesses are the declarative concurrency, streams, and stateful
concurrency chapters in Concepts, Techniques, Models.... or one of the
books on concurrent ml or erlang.

appreciate it

3.Question about SRFI 18: Multithreading support

What is the motivation for having all threads terminate when the
primordial thread terminates? This seems like an arbitrary restriction.

Thanks,
Michael.

4.streams & multithreading & memoization

Hello schemers,

  I'm interested in sharing SRFI-40 streams across (SRFI-18) threads.
  The data I'm streaming is generated by external programs, and I would
like it to run  semi-asynchronously with my scheme program.
  The problem: if I do something like
  (define (gen-stream src)
    (stream-delay (stream-cons (get-next-value src) (gen-stream src))))

and I'm sharing the stream across N threads which are *simultaneously*
doing (stream-car <stream>) (e.g. if (get-next-value src) is blocking),
then, although everyone will receive the same elements (memoization),
N-1 elements will be missed (ignored) by the stream-delay.


  Now I suppose I could change the stream accessors to use mutexes and
only let one thread have access at a time -- that is, only let one
thread force the stream-delay at a time.
  But does anyone have better ideas?

  One thing I've tried is collecting continuations from everyone who
accesses a blocked stream, calling them in succession when I get a new
value to stream.  But I'd rather use threads..
  Another thing I considered was doing something like:
	(define (get-element id)
	  (if (already-got? id) (element-ref id)
	      (begin
		(get-next!)
		(get-element id) ) ) )

        (define (get-stream id)
	  (stream-delay (stream-cons (get-element id) (get-stream (+ id 1)))
	   ) )

where (get-next!) blocks until a mutex signals that a new result may
be accessible via (element-ref <index>).

But then I have to save the entire stream, since I never know when the
memoization is complete.

So I'd have to...
	(define (get-element id)
	  (if (already-got? id)
              (if (already-memoized? id) #f (element-ref id))
	      (begin
		(get-next!)
		(get-element id) ) ) )

        (define (get-stream id)
	  (stream-delay (stream-cons (get-element id) (get-stream (+ id 1)))
	   ) )

	(define (real-stream id)
          (stream-delay
            (let ([s (stream-force (get-stream id))])
              (set-already-memoized?! id #t)
              s) ) )

Which would allow me to release the values once they were memoized.

Isn't there an easier way?
Maybe I've just been thinking about it too much. :)

thanks,
Daniel Faken

5.question about tasks, multithreading and multi-cpu machines

In Ada it is possible to declare multiple parallel-running "tasks". But 
in my opinion the keyword "task" is somewhat misleading because, in fact, 
those "tasks" are really threads.

If I run such a program on a multi-CPU machine, the process itself will 
use only one CPU, even though I create several "tasks".

I tested this with gnat v3.15p under HPUX 11 on a multi-cpu server.

How can I write my code to utilize all CPUs on such a machine? Is there a 
different way in Ada to perform multi-tasking instead of multi-threading?

Thank you for your help!

Best regards, Norbert
