Engineering Design

AMI4900: Engineering Design

Unit 1: Commercial Overview

Section 5: High level partitioning


1.5.1 Partitioning and modularity

When we take the case off an electronic product such as a PC and remove some of the mechanical assemblies protecting the circuitry, we can see the major components of the product. At this stage you will see clearly that the computer is not just a single circuit. Instead, the entire circuit function has been ‘partitioned’, separating it into different areas, to which different techniques have been applied in the interest of both functionality and cost.

The first level of partitioning is functional separation, for example, taking high voltage/high power elements in the power supply and ‘realising’ this as a separate module.

The concept of modularity is important in the computer industry, and also in applications such as test equipment. Although less apparent, there is usually some degree of modularity at the design level in most applications, if only to allow design reuse – if you have a circuit element which works well, and need the same function in a new design, why reinvent the wheel?

For the computer, modularity is also useful for the reasons that:

Circuit partitioning also has to take account of mechanical constraints, such as those resulting from product styling choices. For example, the main part of the circuit may have to be positioned within a volume which does not allow power or output elements to be integrated.

Circuit partitioning, as with the choice of certain kinds of enclosure, will also be affected by the existence of external standards, both for the industry and at a component level.

Circuit partitioning has an impact on the technology which is used: one can generally integrate all the components, but doing so needs a higher level of technology and higher cost than allowing a little more space and partitioning the circuit into modules. The consequence is that partitioning impacts substantially both on the cost of manufacture and on the extent of the design task, and hence on the time to market.


1.5.2 Choosing the correct technology

The complex world of electronic design is not immune to Moore’s Law. In 1965, Intel co-founder Gordon Moore made a prediction, popularly known as Moore’s Law, which in its usual form states that the number of transistors on a chip doubles about every two years. A consequence of this is that both the density of components and the functionality of the component parts are ever increasing. The major elements in the development of electronic circuits can be defined simply as falling in the following basic areas:

As silicon designs grow in complexity and in the number of transistors that can be incorporated onto a single silicon substrate, an area of electronic design is emerging that incorporates one or more of each of the discrete elements into a single design. Taken to its extreme, all the components of an electronics design are incorporated onto a single silicon chip; this is known as a 'System on a Chip' (SoC). This gives very powerful electronic systems with a small footprint on the printed circuit board. Depending upon the silicon technology used for the SoC, lower operating voltages and lower power consumption (with correspondingly better operating temperature and heat dissipation characteristics) can be achieved.

The development costs for SoC devices are consequently high, and great care has to be taken to develop a thoroughly rigorous implementation of the user requirements. In practice, although there is a thrust towards high integration on a single chip, most designers and manufacturers will not have the necessary design capability, either in terms of staff expertise or financial resources. The return on investment required by a company will have to be achieved in quite a short time scale, which may not be feasible.

Complex designs will probably be constructed from advanced individual chips that are specified for particular purposes: Programmable SoC devices (known as PSoC) for analogue designs, or logic based devices such as Field Programmable Gate Arrays (FPGAs), which can be used to construct complex digital circuitry with high performance.

The following diagram gives a simple overview of semiconductor devices that are programmable circuits, or are known as Application Specific Integrated Circuits (ASICs). There are two major subsets of ASICs: logic based devices and processor based devices. Processor based devices have grown in usage since they were first introduced, benefiting from the growth in processing power due to reduced cell geometries. They can be subdivided into microcontrollers and Digital Signal Processing (DSP) microcontrollers. PSoC devices tend to combine sub-sets of analogue circuits with a microcontroller, and can be user defined to meet specific analogue circuit requirements for signal processing, usually before the signal is digitised for onward digital manipulation. Care still has to be taken with these circuits to ensure adequate analogue performance, as their noise margins and noise floors are not yet as good as those of discrete analogue devices.

Figure 1.5.1

Outline of Semiconductor Devices

 

The microcontroller is part of the family of programmable chips. It is a microprocessor which is integrated into a single IC package along with a range of useful functions. A microcontroller shares in common with a microprocessor the ability to execute a stored set of instructions in order to carry out a user defined task. It differs from a logic based device in that instead of having a dedicated logic structure for each task, the same general purpose hardware is constantly reconfigured by its software instructions to perform different operations on the available data.

The microprocessor is aimed at computationally intense tasks which generally require large amounts of fast memory, whilst the microcontroller is more self-contained and has a wider variety of I/O resources. To support the expansion of resources beyond those available on the chip itself, a microcontroller may include the "hooks" to make such expansion easy. This is called expanded mode operation.

Typical resources integrated within a microcontroller include data and program memory, timers, analogue to digital converter, digital to analogue converters, pulse width modulation circuits, and general purpose I/O ports.

Key considerations when working with microcontrollers

When we look at the resources that can be placed within a microcontroller device, they will usually have been combined to provide the maximum performance in the smallest physical size. When they are presented they have the following capabilities:

For the devices the following names are used:

The table below lists a set of possible tasks for which an electronic system may be used and indicates which type of device is probably most appropriate.

 

Characteristics of tasks against possible implementation technology
Characteristic
µP
µC
DSP
ASIC
A soft specification - a number of features for which there are a large number of options.

X

A complex user interface, often involving the use of a display and keypad. The logging and manipulation of large quantities of data.
X
Complex heuristic algorithms (heuristic - a process of guided trial and error; algorithm - the logical sequence of steps needed to carry out a task).
X
Most of the requirement can be met without use of expanded mode or by simple additional support hardware.
X
High speed and numerically intensive
X
Large memory requirements
X
Complex but well defined fixed function
x
Large requirement for either memory or additional I/O that makes extensive use of expanded mode and additional off chip resource
x

A problem which frequently occurs in electronic engineering design applications is deciding which functions should be carried out in software and which in hardware. The problem is best explored by examples. Figure 1.5.2 shows an infrared laser transmitter that was designed as part of an infantry tactical training system. The microcontroller system is located within the optical sight of an automatic rifle. The piezo accelerometer monitors vibrations within the rifle and when a blank shot is fired the laser transmits a coded pulse train. This is detected by sensors on harnesses worn by other participants in a simulated tactical engagement. A microcontroller in the harness computes the centre of the transmitted beam. Thus the point of aim of the transmitter can be determined in real-time.

Figure 1.5.2

Infrared Laser Transmitter

In the first generation of the transmitter both the tone generator and the high voltage power supply were built from discrete components. In the second generation product the microcontroller had a built-in counter/timer block. In order to reduce component count and save cost and space, the microcontroller took over both the high voltage power supply function and the tone generator. The software tone generator proved both very easy to code and very useful, in that it allowed a large range of different sounds to be generated which could be used for fault diagnosis. The software power supply proved more complex. It switched the drive FET in the flyback generator directly and monitored the output voltage via a comparator. This created a digital feedback loop which put a considerable strain on the combined software/hardware resources of the simple microcontroller. The additional diagnostic benefits gained did not outweigh the additional software complexity.

The lesson to be learned is that just because something can be done in software does not mean that it should be. In very high volume designs it will pay to trade off software design effort against reduced component count. In the extreme, microcontroller cores can be used in an ASIC to produce a single chip solution. In low volume applications it is not usually sensible to mimic off-the-shelf commodity hardware functions in software. Nor is it sensible to attempt well defined, time critical functions that would be difficult to code and would stretch the microcontroller hardware resource, when they could be implemented in an external, self-contained hardware block (perhaps an FPGA).

This is a simple example of high level partitioning.

To develop a strategy for high level partitioning it is necessary to have established a robust set of user requirements that completely specifies the objectives of the proposed system. This has to be written in the language of the user and has to specify the performance criteria that are expected. Important considerations from the viewpoint of high level partitioning are:

The team considering the system specification must then understand the alternative techniques and technologies that are available to them, and the trade-off between performance and cost, both in final equipment costs and in the development costs that will have to be incurred.


1.5.3 High level partitioning

High level partitioning is the logical distribution of resources and processing power to meet the requirements of a user specification. This is a critical stage in the design process. Many engineers, scientists and academics are extremely confident in their ability to implement a design using modern software development packages. They are, however, often less good at managing the interface to the user who has prepared a defined requirement. The danger is that the two sides of the discussion will not fully comprehend each other's desires, capabilities and solutions. A great effort is needed to manage this task successfully.

Large electronic systems will consist of a range of the building blocks discussed above. They can be brought together to form a functional unit using single or multiple printed circuit board systems, depending on the mechanical design and how much volume is available for the product. They can also be developed using single chip solutions known as a System on a Chip (SoC). In more complex designs it may not be possible to use one single chip, and it will then be necessary to partition the design into sets of functions that can be integrated into a number of chips, or a 'chip set', for the particular application.

This is high end technology and may not be suitable for all applications. Because the SoC is a single chip, the integration of the system has to be evaluated thoroughly and proved to be a valid functional system before the chip is completed. If it is not, the design may have to be reworked a number of times to gain the desired result. This design iteration can be extremely expensive if the errors are only discovered after fabrication. For convenience, we will refer to all of these complex systems, whether single chip or chip set, as SoCs.

Many SoCs currently have gate counts averaging five to ten million gates. In addition, many contain a huge amount of software that is run on a DSP and/or a general-purpose processor.

A practical methodology to intelligently partition a system between hardware and software is required.

Introduction

A central challenge in designing complex devices is creating an architecture that partitions the hardware and software functionality optimally by considering hardware constraints and costs such as area, power, and system speed. Often, a system has specific performance goals, but other limitations (such as cost or need for programmability) restrict the amount of the design that can be put into silicon. This requires a tradeoff analysis of silicon area vs. performance, with part of the system ending up in hardware and the remainder in software.

The process of hardware and software partitioning is generally somewhat one-sided. In many devices the DSP’s performance and MIPS budget are the gating factors for deciding what will be hardware and what will be software. That is, if the software will not be fast enough, or if it will use up too many MIPS, it must go in hardware. Unfortunately, this decision is often made without regard to hardware design considerations such as area, power, and performance.
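The one-sided, MIPS-driven decision described above can be sketched in a few lines. This is only an illustration, not a recommended method: the function name `offload_to_fit_budget`, the chunk names and all MIPS figures are hypothetical, and the greedy "move the heaviest chunk first" rule is an assumption made for the example.

```python
def offload_to_fit_budget(chunk_mips, mips_budget):
    """Move the most MIPS-hungry chunks to hardware until software fits the budget.

    chunk_mips: dict mapping chunk name -> estimated MIPS if run in software.
    Returns (software_chunks, hardware_chunks) as sorted lists of names.
    """
    software = dict(chunk_mips)
    hardware = []
    # Offload the heaviest remaining chunk while the DSP is over budget.
    while software and sum(software.values()) > mips_budget:
        heaviest = max(software, key=software.get)
        hardware.append(heaviest)
        del software[heaviest]
    return sorted(software), sorted(hardware)


# Hypothetical workload figures, for illustration only.
chunks = {"fft": 120, "fir_filter": 60, "viterbi": 90, "control": 10}
sw, hw = offload_to_fit_budget(chunks, mips_budget=200)
print("software:", sw)
print("hardware:", hw)
```

Note that nothing in this sketch looks at the area, power or speed of the hardware that results, which is exactly the weakness of the approach.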

Estimates are often wrong, causing significant costs later in the design process.

The idea of intelligently partitioning between hardware and software is not new, dating back 12 or more years. The basic problem of partitioning is well-defined.

In short, any given system can be thought of as composed of a number of fundamental units that will never be split across hardware and software. These fundamental units range in size from single instructions to basic blocks to entire processes, depending on the approach taken. Because the size of these fundamental units varies widely in the literature, we will use the generic term “chunk” to refer to a fundamental unit.

A set of hardware and software implementations is created or assumed present, so that each chunk has both a hardware and software implementation. Then, a cost metric is developed for both hardware and software implementations, and the problem can be solved via a hill-climbing heuristic.
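As a minimal sketch of this formulation, the code below gives each chunk one software and one hardware implementation, defines a weighted cost, and applies a simple hill-climbing search that flips one chunk at a time. The cost metric (latency plus weighted area), the chunk names and all numbers are assumptions chosen for illustration; real cost functions vary from design to design.

```python
def total_cost(assignment, chunks, area_weight):
    """Weighted cost: latency of every chunk plus area of chunks placed in hardware."""
    cost = 0.0
    for name, in_hw in assignment.items():
        c = chunks[name]
        if in_hw:
            cost += c["hw_latency"] + area_weight * c["hw_area"]
        else:
            cost += c["sw_latency"]
    return cost


def hill_climb(chunks, area_weight=1.0):
    """Start all-software; repeatedly apply the single flip that lowers cost most."""
    assignment = {name: False for name in chunks}  # False = software, True = hardware
    best = total_cost(assignment, chunks, area_weight)
    while True:
        best_flip = None
        for name in chunks:
            assignment[name] = not assignment[name]   # trial flip
            cost = total_cost(assignment, chunks, area_weight)
            assignment[name] = not assignment[name]   # undo trial
            if cost < best:
                best, best_flip = cost, name
        if best_flip is None:
            return assignment, best                   # local optimum reached
        assignment[best_flip] = not assignment[best_flip]


# Hypothetical chunks and cost figures, for illustration only.
chunks = {
    "fft":    {"sw_latency": 100, "hw_latency": 10, "hw_area": 30},
    "filter": {"sw_latency": 40,  "hw_latency": 5,  "hw_area": 50},
    "ui":     {"sw_latency": 5,   "hw_latency": 4,  "hw_area": 20},
}
assignment, best = hill_climb(chunks)
print(assignment, best)
```

Hill climbing stops at the first assignment that no single flip can improve, so it may settle on a local rather than a global optimum.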

It should be mentioned that hardware-software partitioning is only one part of the overall hardware-software co-design solution. Nonetheless, the academic maturity of this problem has yet to be leveraged in most real-world design flows.

Existing partitioning model

Although a mature, well-defined, systematic methodology is often absent in the design of today’s systems, that is not to say that the issue of partitioning is not addressed. It is, in fact, addressed, but in an ad hoc manner. The existing methodology, discussed below, applies to both ad hoc and more systematic approaches.

To discuss hardware-software partitioning effectively, we must first place it in the context of an overall co-design approach. Again, this could be a specific systematic approach or a completely ad hoc one. Regardless, the same decisions must be made.

A general approach to partitioning, as done today, is shown in Figure 1.5.2.

Figure 1.5.2

Current approach to hardware-software partitioning.

All systems start with a specification, which could be represented on paper, as executable code, or in another form (such as Esterel or SpecCharts). See

http://www.eetimes.com/editorial/1998/rtlfeature9807.html

http://www.cse.iitb.ac.in/~ramesh/IT-606-ramesh-L1.ppt#646,2,Models and Tools for Embedded Systems S. Ramesh

http://ls12-www.cs.uni-dortmund.de/~marwedel/kluwer-es-book/es-marw-2f-c.ppt

The specification is then fragmented into “chunks” that can be considered for either hardware or software. In ad hoc approaches, the chunks are usually processes or functions, while the automated approaches in the literature may consider chunks as small as basic blocks or even instructions.

Cost estimates are created for each chunk, based on the relevant cost metrics for both a software and a hardware implementation of the chunk. The exact cost metrics vary from design to design (and approach to approach), but may consider performance, area (usually only for hardware), power, etc. More complex but more accurate cost functions also account for the communication between hardware and software.
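To illustrate why the communication term matters, here is a hedged sketch of a cost function that charges a penalty for every data transfer crossing the hardware-software boundary. The function `partition_cost`, the per-word penalty and all chunk figures are hypothetical, chosen only to show the effect.

```python
def partition_cost(in_hardware, chunk_costs, transfers, comm_cost_per_word):
    """Cost of a partition including traffic across the hardware-software boundary.

    in_hardware: set of chunk names implemented in hardware.
    chunk_costs: name -> (software_cost, hardware_cost).
    transfers: list of (producer, consumer, words_transferred).
    """
    cost = 0
    for name, (sw_cost, hw_cost) in chunk_costs.items():
        cost += hw_cost if name in in_hardware else sw_cost
    for producer, consumer, words in transfers:
        # Only data crossing the boundary pays the interface penalty.
        if (producer in in_hardware) != (consumer in in_hardware):
            cost += words * comm_cost_per_word
    return cost


# Hypothetical figures, for illustration only.
chunk_costs = {"capture": (20, 8), "fft": (100, 15), "detect": (30, 12)}
transfers = [("capture", "fft", 256), ("fft", "detect", 64)]

only_fft = partition_cost({"fft"}, chunk_costs, transfers, comm_cost_per_word=0.25)
front_end = partition_cost({"capture", "fft"}, chunk_costs, transfers, comm_cost_per_word=0.25)
print(only_fft, front_end)
```

With these numbers, moving only the FFT into hardware looks attractive chunk by chunk, but the large transfer into it crosses the boundary; placing the capture stage in hardware as well removes that crossing and gives a much lower total.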

Finally, the partitioning is selected, and the hardware and software are created (often by separate teams). Also, the interfaces must be chosen and created at this time.

System Specification

Although a paper specification is by far the most common starting point, an executable specification, that is, a specification written in a language that can be compiled and run as an executable, is becoming commonplace for many types of complex systems. This has many advantages, the greatest of which is an unambiguous representation of the system that all implementations can be concretely measured against.

Given that an executable specification is the preferred type of specification, from a pragmatic point of view, the language chosen should be one that can easily represent both software and hardware. Furthermore, if a more automated methodology is desired, the language should support both software and hardware synthesis. Given these requirements, C and C++ are logical starting points, as they can be directly used as software implementations, and/or augmented with SystemC constructs so high-level synthesis can be used to generate hardware gates.

Fragmenting the system

Much research has gone into partitioning of various sized chunks, ranging from the instruction level up through processes. Fine-grained partitioning will be a significant advantage in co-design systems of the future.

However, from a practical, non-theoretical point of view, system designers often know the granularity with which they will partition. For example, in DSP and graphics systems, the design often very logically decomposes into blocks of reasonable size.

While it is foolish to assume that no additional benefit could ever be gained by looking at smaller pieces of the design, we can safely, if not optimally, rely on the expertise of the designer (and natural composition of the system) to fragment the system for us.

Cost estimates and synthesis

We combine the steps of estimating costs and synthesis because they can be done hand-in-hand. Clearly, that is already the case with getting performance estimates for software, as that can be easily done by compiling and executing. The technology is now available to provide a similar approach for cost estimates of candidate hardware portions of the system.

First, however, we must correct a basic assumption about partitioning. Earlier, we mentioned that a hardware implementation is created or assumed present for each chunk. However, in many cases, that is a grossly insufficient model. To truly optimize the system, multiple hardware implementations must be considered for each chunk. For example, there could be a high performance implementation, a low performance implementation, and any number of implementations in between those extremes. Each of those implementations should be considered when determining the hardware-software partitioning of the system. The addition of this one-to-many mapping of chunks to hardware implementations, and accurately determining the relative cost and benefit of each, allows better system architecture exploration.
Of course, this also complicates the problem, by requiring the ability to determine cost metrics for multiple hardware implementations from a single specification. Also, since there are now multiple implementations being judged against each other, the cost metrics (whatever they are) must be fairly accurate. The best way to get accurate estimates is to make measurements on real RTL code.

Since practical high-level synthesis from C++ (SystemC) exists today, this is the preferred path to getting the desired cost metrics. RTL can be quickly created for a number of implementations, and then very accurate cost metrics (such as latency, area, etc.) can be extracted from the RTL.

This approach provides more accurate estimates as well as the ability to do more extensive system exploration by considering multiple hardware implementations of each chunk. The limitations of the earlier model, namely inaccurate estimates and a single hardware implementation per chunk, may be why systematic hardware-software partitioning is noticeably absent from many system-level design flows.

Pragmatic Partitioning Model

The refined, pragmatic hardware-software partitioning model is shown in Figure 1.5.3.
The system is represented as an executable specification in C or C++, and the designer determines the relevant chunks to be considered for hardware or software implementations. For example, in a DSP design, these are likely to be the DSP “building blocks” (FFTs, filters, etc.), while in graphics they would likely be the functions that make up the graphics pipeline. By allowing the designer to determine the relevant chunks, this methodology is applicable across application spaces (DSP, graphics, networking, etc.).

To get the cost metrics (performance, etc.) for the software implementation, a compilation can be done. More time can be spent refining the C++ model before compilation if a more accurate software measurement is needed. Also, in many cases, especially DSP and graphics, the software costs (most likely performance) of an optimized design may be known a priori from previous systems.

Hardware cost metrics are obtained by using high-level synthesis on the C++ model to get a range of hardware implementations. The specifics of this process are detailed in the next section. Then, cost metrics can be directly obtained from the resulting RTL or gates.
The hardware and software cost metrics are plugged into the cost function, and the system level partitioning is determined (either by hand or via an automated process with a hill-climbing heuristic). Although the cost function may be the same as in the previous methodology, at a minimum it must be able to support an n-way decision, as compared to the binary decision of hardware vs. software, as there are now multiple candidate hardware implementations.
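The n-way decision can be sketched as follows. Each chunk carries a list of candidate implementations, with index 0 being software and the rest being alternative hardware implementations, and a hill-climbing search changes one chunk's choice per step. The objective (minimise total latency under an area budget), the candidate names and all figures are assumptions for illustration; software is assumed to cost no silicon area, so the all-software starting point is always feasible.

```python
def evaluate(choice, candidates):
    """Return (total_latency, total_area) for one candidate index per chunk."""
    latency = sum(candidates[name][i][1] for name, i in choice.items())
    area = sum(candidates[name][i][2] for name, i in choice.items())
    return latency, area


def n_way_hill_climb(candidates, area_budget):
    """Minimise total latency under an area budget, changing one chunk per step."""
    choice = {name: 0 for name in candidates}       # index 0 = software everywhere
    best_latency, _ = evaluate(choice, candidates)
    while True:
        best_move = None
        for name, options in candidates.items():
            original = choice[name]
            for i in range(len(options)):
                if i == original:
                    continue
                choice[name] = i                     # trial move
                latency, area = evaluate(choice, candidates)
                if area <= area_budget and latency < best_latency:
                    best_latency, best_move = latency, (name, i)
            choice[name] = original                  # undo trial
        if best_move is None:
            return choice, best_latency              # no single change helps
        choice[best_move[0]] = best_move[1]


# Hypothetical candidates as (label, latency, area); figures are illustrative only.
candidates = {
    "fft":    [("software", 100, 0), ("hw_small", 40, 10), ("hw_fast", 10, 30)],
    "filter": [("software", 40, 0),  ("hw_small", 20, 8),  ("hw_fast", 5, 25)],
}
choice, best_latency = n_way_hill_climb(candidates, area_budget=40)
print({name: candidates[name][i][0] for name, i in choice.items()}, best_latency)
```

In this example the budget cannot hold both fast hardware blocks, so the search ends with the fast FFT and the small filter; with a binary hardware-versus-software model that intermediate option would not have been available.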

WWW Research

http://www.eetimes.com/editorial/1998/rtlfeature9807.html

http://www.cse.iitb.ac.in/~ramesh/IT-606-ramesh-L1.ppt#646,2,Models and Tools for Embedded Systems S. Ramesh

http://ls12-www.cs.uni-dortmund.de/~marwedel/kluwer-es-book/es-marw-2f-c.ppt

 

http://www.forteds.com/index.asp?bhcp=1


SystemC Community. http://www.systemc.org/
