Design of 16MB Addressing Spaces in an MCU Based on the MCS-51 Structure
Authorized licensed use limited to: Guangdong Univ of Tech. Downloaded on May 22,2010 at 11:44:47 UTC from IEEE Xplore. Restrictions apply.
JING Wei-liang, HU Yue-li, CAO Jia-lin
Microelectronic Research and Development Center, Shanghai University
NO.149, Yanchang Road, Shanghai 200072, China
Email: huyueli@shumchip.com, Phone: 086-021-56331271
Abstract:
This paper presents the design of 16M-byte addressing spaces in an MCU (Micro-control Unit) based on the structure of MCS-51. Both the external data and the program storage capacities are enlarged from 64K to 16M bytes without adding lines to the time-multiplexed address buses and without changing the instruction set. As a result, large amounts of data and programs with complex algorithms can be processed and run on this MCU. After successful front-end simulation, practical experiments on a Nios development board verify that the design is correct. Furthermore, the design works well in practical applications and enables the MCU to control an image processing system with far more than 64KB of program code and a very large amount of data.
Index terms: MCS-51, MCU, Program Counter, Program Address, Addressing space.
1. Introduction
A SCMP (Single-Chip MultiProcessor) structure is employed in our “Skin Diagnoses” machine vision chip. Each slave processor has its own responsibility, such as common machine vision algorithms or database and knowledge-base management, and works without user involvement, while a master processor faces users directly. The instruction set is fully compatible with that of MCS-51, so for users, running certain image processing algorithms is the same as executing MCS-51 instructions. System application engineers need not learn specialized knowledge of image processing, machine vision training or programming to achieve the functions concerned, so machine vision technologies become easy to use in all kinds of MCU fields.
To achieve this objective, an MCU able to control and handle intricate computations is required; it is the core of the machine vision chip. MCS-51 has a long history of use, a wide variety of application fields and plentiful third-party software and simulation tools. Therefore, an MCU with the MCS-51 structure is designed. However, the data space and the program capacity of MCS-51 are only 64KB each[1]. Such sizes cannot meet the requirements of tremendous data and complex algorithms of image processing. If the addressing spaces of the 51 MCU can be enlarged without changing the instruction set or the definitions of ports and pins, the performance of MCUs with the MCS-51 structure will be greatly improved. This paper puts forward a time-multiplexed method and implements the design of both 16MB data and 16MB program addressing spaces.
Foundation item: Project supported by the Science Foundation of Shanghai Municipal Commission of Education (Grant No.03AK16); Key Technology Foundation of Shanghai Municipal Commission of Science and Technology (Grant No.025911323).
Communication author: HU Yue-li.
2. Design of 16MB addressing spaces
2.1 Ideas
2.1.1 Data Memory
An 8-bit SFR (Special Function Register), ‘DPTR_PAGE’, is added as the page address of an external 16MB data memory so that the MCU can access this 16MB space. Each page is 64KB, so the 16MB data memory is divided into 256 (2^8) pages. Addressing within a page is the same as in a standard MCS-51. The page address is sent out by PORT2 as the highest 8 bits. The middle 8-bit address and the low 8-bit address are sent out by PORT2 and PORT0, just as a standard 8051 transmits its 16-bit address.
Fig.1 External Data Memory connected to an MCU
The connecting method of the MCU and its external data memory is shown in Fig.1. Address Latch 1 latches the low 8-bit address and Address Latch 2 latches the high 8-bit address. The middle 8-bit address is sent out after the high 8 bits are latched.
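The paged addressing described above can be illustrated with a small Python sketch. It only models how a 24-bit data address splits into the three bytes and which path carries each byte to the external memory; the function names and the dictionary keys are hypothetical, and no bus timing is implied.

```python
def split_data_address(addr24):
    """Split a 24-bit external data address into (page, middle, low) bytes."""
    if not 0 <= addr24 <= 0xFFFFFF:
        raise ValueError("address must fit in 24 bits")
    page = (addr24 >> 16) & 0xFF    # DPTR_PAGE: selects one of 256 pages
    middle = (addr24 >> 8) & 0xFF   # DPTR_HIGH: high byte inside the 64KB page
    low = addr24 & 0xFF             # DPTR_LOW: low byte inside the 64KB page
    return page, middle, low


def join_data_address(page, middle, low):
    """Rebuild the 24-bit address from its three bytes."""
    return (page << 16) | (middle << 8) | low


def memory_address_lines(addr24):
    """Which path presents each byte to the memory (labels are illustrative)."""
    page, middle, low = split_data_address(addr24)
    return {
        "latch2_A23_A16": page,    # page byte from PORT2, held by Address Latch 2
        "port2_A15_A8": middle,    # middle byte driven on PORT2 after the page is latched
        "latch1_A7_A0": low,       # low byte from PORT0, held by Address Latch 1
    }


PAGE_SIZE = 1 << 16                      # 64KB per page
PAGE_COUNT = (1 << 24) // PAGE_SIZE      # number of pages in 16MB
```

For example, address 12ABCDH lies in page 12H at offset ABCDH, and `PAGE_COUNT` evaluates to 256, matching the page count given in the text.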
2.1.2 Program Memory
To enable an MCU to access a 16MB program memory, two design ideas were considered at the outset:
One is to change the instruction set. For example, the original LCALL/LJMP addr16 instructions have 16-bit operands. We could change them to LCALL/LJMP addr24 instructions with 24-bit operands, which would let a PA (Program Address) jump to another program module or call another sub-module freely within the 16MB addressing space. This modification is relatively easy with regard to design issues, but standard MCS-51 compilers would have to be modified or even redesigned. Otherwise, code produced by standard compilers for the original instruction set could not be decoded correctly. MCUs with such a modified instruction set would therefore lack compatibility, which would hinder their spread and application.
The other one is to divide a program memory logically without changing any MCS-51 instruction. The space of an addressable 16MB program storage comprises 256 blocks and each block address determines the highest 8 bits of a 24-bit address. Any particular address in each 64KB block consists of the middle 8 bits and the low 8 bits of a 24-bit PA. A PC (Program Counter) keeps the low 16 bits of a PA.
Fig.2 Structure of Program Memory Spaces
For better compatibility, we adopt the second approach, in which the standard MCS-51 instruction set is unchanged and no additional 8-bit address lines are added. Fig.2 shows the structure of the extended program memory addressing spaces.
2.2 16MB addressing spaces of a Data Memory
2.2.1 Data Pointer Register (DPTR)
A data pointer is composed of three 8-bit registers, namely DPTR_PAGE, DPTR_LOW and DPTR_HIGH.
The address of DPTR_PAGE is 95H and this register can be customized by users. It stores the high 8 bits of a 24-bit PA.
The addresses of DPTR_LOW and DPTR_HIGH are 82H and 83H, respectively. Both registers can also be customized by users. They keep the low 8 bits and the middle 8 bits of a 24-bit PA.
2.2.2 Automatic Detecting Logic
This work proposes a cross-page approach implemented with an automatic detecting logic.
A ‘DEC DPTR’ instruction, whose operation code is A5H, is added to the MCU. When the MCU accesses external data memory, an INC DPTR or DEC DPTR instruction may change the low 16 bits of DPTR from FFFFH to 0000H or from 0000H to FFFFH, which crosses a page boundary. In that case the automatic detecting logic increases or decreases the content of DPTR_PAGE by one, so that data are correctly written into or read out of the next or previous page of the 16MB data memory. Part of the code implementing the cross-page function is shown in Fig.3. PC_CON[0] is the decoding signal of the ‘INC_DPTR’ instruction and PC_CON[1] is the decoding signal of the ‘DEC_DPTR’ instruction. The CCLK signal is the working clock of the MCU.
Fig.3 Partial codes of an Automatic Detecting Logic
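The following Python sketch is not the code of Fig.3; it is a rough behavioural model of the cross-page rule stated in the text, written under a few assumptions. The function name `step_dptr` and its arguments are hypothetical, the `inc` and `dec` flags stand in for the decoded PC_CON[0] and PC_CON[1] signals, and the wrap of DPTR_PAGE from FFH to 00H (and back) is assumed because the register is 8 bits wide.

```python
def step_dptr(page, dptr16, inc=False, dec=False):
    """Return (page, dptr16) after one INC DPTR or DEC DPTR.

    page   -- current DPTR_PAGE value (0..255)
    dptr16 -- current low 16 bits of the data pointer (0..0xFFFF)
    inc    -- True when INC DPTR is decoded (models PC_CON[0])
    dec    -- True when DEC DPTR is decoded (models PC_CON[1])
    """
    if inc == dec:
        # Neither instruction decoded (or an invalid both-set case): hold values.
        return page, dptr16
    if inc:
        new16 = (dptr16 + 1) & 0xFFFF
        if dptr16 == 0xFFFF:            # FFFFH -> 0000H: move to the next page
            page = (page + 1) & 0xFF    # assumed 8-bit wrap
    else:
        new16 = (dptr16 - 1) & 0xFFFF
        if dptr16 == 0x0000:            # 0000H -> FFFFH: move to the previous page
            page = (page - 1) & 0xFF    # assumed 8-bit wrap
    return page, new16
```

In this model, incrementing from page 12H, offset FFFFH gives page 13H, offset 0000H, and decrementing that result returns to page 12H, offset FFFFH.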
2.3 16MB addressing spaces of a Program Memory
In the MCU, the PC, PA, PRO_BLOCK and DPTR control modules compute and store the address of the next instruction storage unit. These four modules control the generation of the 8-bit block address, the formation of the 24-bit PA and the operation of the 16-bit PC and DPTR.
2.3.1 PA (Program Address) & PC (Program Counter)
To accelerate the processing rate of the MCU, a prefetching technique is employed. The 24-bit address of the next instruction is read out at the third phase of the last machine cycle of the current instruction. Every program memory address comes from the PA, according to which the addresses of both operation codes and operands are transmitted and stored.
When the MCU is reset or supplied with power, the initial value of a 24-bit PA is 000000H.
A program counter (PC) is a 16-bit special register that enables the MCU to address a 64KB space. When the MCU is reset or supplied with power, the initial value of the 16-bit PC is 0000H. The PC register is independent and is not accessible to users[2].
The low 16 bits of a PA are the same as the 16 bits of the PC except when MOVC instructions are processed. For example, while the ‘MOVC A, @A+PC’ or ‘MOVC A, @A+DPTR’ instruction is running, the PC does not change, but the content of the PA must be changed to the value of (A+PC) or (A+DPTR) so that the MCU can read the data at the updated address and assign them to the Accumulator. The control logic circuit of a PC therefore differs a little from that of a PA.
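As an illustration of why the PA and the PC need separate control logic, the sketch below models the address used for the data read of a MOVC instruction. It is a simplified model under stated assumptions: the function name is hypothetical, the 16-bit sum is assumed to wrap inside the 64KB block as on a standard MCS-51, and because the text does not say which register supplies the high 8 bits of the PA during MOVC, that byte is simply passed in as `high8`.

```python
def movc_fetch_address(high8, pc, acc, dptr=None):
    """Return (PA, PC) for the data read of a MOVC instruction.

    high8 -- high 8 bits of the PA during the read (source is an assumption)
    pc    -- 16-bit PC, already pointing at the next instruction
    acc   -- Accumulator value used as the unsigned offset (0..255)
    dptr  -- 16-bit DPTR for 'MOVC A, @A+DPTR'; None selects 'MOVC A, @A+PC'
    """
    base = pc if dptr is None else dptr
    offset16 = (base + acc) & 0xFFFF      # assumed to wrap within the 64KB block
    pa = ((high8 & 0xFF) << 16) | offset16  # PA points at the table entry
    return pa, pc                         # the PC itself is left unchanged
```

With `high8 = 19H`, `pc = 0100H` and `acc = 20H`, the model reads from PA 190120H while the PC stays at 0100H, so instruction fetching resumes at the correct place afterwards.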
2.3.2 Block Address Generator
The block address generator generates the high 8 bits of a PA, so the correctness of the sequencing among the block address sources directly determines whether programs run correctly. Apart from the default value ‘00H’, the block address has three sources: the content users can customize; the data popped from an external or internal PA stack after RET/RETI instructions are processed; and the updated block address after programs jump relatively across adjacent blocks. See Fig.4.
Fig.4 Structure of a Block Address Generator
2.3.2.1 PRO_BLOCK SFR
Users should set the target block address before an LJMP/LCALL instruction is processed in order to jump across blocks or to call sub-modules located in other blocks. For this purpose, a new SFR called PRO_BLOCK is added to the MCU. Its address is FFH, and its value is set to 00H after the system is reset. Users can write an 8-bit block address into this SFR before running LJMP/LCALL instructions. In Fig.4, PRO_BLOCK_W is a write-enable signal and REG_RESULT stores the block address customized by users.
2.3.2.2 Block address after Interrupt/Calling
As is shown in Fig.4, XRAMDI stores the value of a block address to be popped out from the stack of an external data memory while SOURCE_DI stores the value of a block address to be popped out from the stack of an internal data memory. At the end of RET/RETI instructions, block addresses originally stored in an external or internal memory are popped out from either stack and transmitted to the PRO_BLOCK SFR.
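The save and restore of a 24-bit return address can be sketched as follows. This is only an illustration: the text does not give the order in which the three bytes are stacked, so the sketch assumes the standard 8051 order for the low and middle bytes (low byte first) and assumes the block byte is pushed last. The function names are hypothetical and a Python list stands in for the internal or external stack.

```python
def push_return_address(stack, pa24):
    """Push a 24-bit return address as three bytes (byte order is assumed)."""
    stack.append(pa24 & 0xFF)           # low byte, as PCL on a standard 8051
    stack.append((pa24 >> 8) & 0xFF)    # middle byte, as PCH on a standard 8051
    stack.append((pa24 >> 16) & 0xFF)   # block byte: its position is an assumption


def pop_return_address(stack):
    """Pop three bytes at RET/RETI and rebuild the 24-bit PA."""
    block = stack.pop()                 # goes back into PRO_BLOCK
    middle = stack.pop()                # high byte of the 16-bit PC
    low = stack.pop()                   # low byte of the 16-bit PC
    return (block << 16) | (middle << 8) | low


# Nested case: the main program is interrupted, then the handler is interrupted.
stack = []
push_return_address(stack, 0x1A0010)    # hypothetical return point in the main program
push_return_address(stack, 0x020345)    # hypothetical return point in the first handler
first_return = pop_return_address(stack)
second_return = pop_return_address(stack)
```

Here `first_return` is 020345H and `second_return` is 1A0010H, so each RETI restores both the block address and the 16-bit PC of the code that was interrupted.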
2.3.2.3 Automatic Increment of a Block address
In Fig.4, the next block address value (PRO_BLOCK + 1) is stored in ROM_0. When a program runs sequentially through two adjacent blocks, the value of the PRO_BLOCK SFR is automatically updated to that of ROM_0, and the MCU can correctly fetch the next operation code or operand at the beginning of the next block.
2.3.2.4 Block address after Relative Jumping
The previous block address value (PRO_BLOCK - 1) is stored in ROM_1, shown in Fig.4. When a program jumps relatively to the previous block, the value of the PRO_BLOCK SFR is automatically updated to that of ROM_1. When a program jumps relatively to the next block, the value of PRO_BLOCK is changed to that of ROM_0.
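The selection among the block address sources described in this section can be summarised in a short behavioural sketch. It is not the circuit of Fig.4: the function name and keyword arguments are hypothetical, the priority order among simultaneous sources is an assumption, and the 8-bit wrap of ROM_0 and ROM_1 is assumed from the register width.

```python
def next_block(pro_block, *, user_write=None, popped=None,
               run_through=False, rel_jump=0):
    """Return the next PRO_BLOCK value.

    user_write  -- block address written by the user (models REG_RESULT), or None
    popped      -- block address popped at RET/RETI (XRAMDI or SOURCE_DI), or None
    run_through -- True when execution runs past the end of the current block
    rel_jump    -- +1 for a relative jump into the next block,
                   -1 for one into the previous block, 0 otherwise
    """
    rom_0 = (pro_block + 1) & 0xFF      # candidate: next block
    rom_1 = (pro_block - 1) & 0xFF      # candidate: previous block
    if user_write is not None:          # priority order here is an assumption
        return user_write & 0xFF
    if popped is not None:
        return popped & 0xFF
    if run_through or rel_jump > 0:
        return rom_0
    if rel_jump < 0:
        return rom_1
    return pro_block                    # no source active: keep the current block
```

For instance, `next_block(0x19, run_through=True)` gives 1AH, and `next_block(0x1A, rel_jump=-1)` gives 19H, the two transitions exercised later in the simulation section.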
3. Simulation & Verification
3.1 Simulation on a workstation
The soft core is simulated fairly thoroughly on a workstation (SUNW, Ultra-60; sparc; sun4u). The simulation tool is Verilog-XL (Cadence). Part of the assembly code testing the CJNE instructions is shown in Fig.5. The objective of this test bench is to verify that the design of the 16MB program addressing spaces is correct: when a program runs sequentially through two adjacent blocks, the value of PRO_BLOCK increases by one automatically, and any program can perform relative jumps correctly both within a block and across blocks.
Fig.5 Testing codes of CJNE instructions
The waveforms of testing results are shown below. STATE is a signal of the MCU state machine and PROGA is the same as a 24-bit PA.
Fig.6 CPL 0D3H
Fig.6 is the beginning waveform of the testing codes. PORT0 transmits data and the low 8 bits of a PA every other phase. ‘B2 D3’ is the machine code of ‘CPL 0D3H’.
Fig.7 CJNE @R0, #01H, ADD0
In Fig.7, ‘B6 01 66’ is the machine code of ‘CJNE @R0, #01H, ADD0’. At the end of this instruction, PROGA shows the address of ADD0, ‘1900C4H’, and ‘MOV P1, 0D0H’ will be processed.
Fig.8 MOV P1, 11H
In Fig.8, ‘85 11 90’ is the machine code of ‘MOV P1, 11H’. Since no jump or call instruction is being processed, PROGA increases by one automatically and crosses from BLOCK_19 to BLOCK_1A.
Fig.9 CJNE @R1, #0E7H, ADD4
In Fig.9, ‘B7 E7 F2’ is the machine code of ‘CJNE @R1, #0E7H, ADD4’. At the end of this instruction, PRO_BLOCK decreases by one automatically, from BLOCK_1A to BLOCK_19, which shows that a relative jump across adjacent blocks is achieved.
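The relative-jump arithmetic behind these waveforms can be checked with a short sketch. It assumes standard MCS-51 relative addressing (target = address after the 3-byte CJNE plus the signed 8-bit offset) combined with the block update rule of Section 2.3.2.4. The function names are hypothetical. The opcode address 19005BH is derived from the target 1900C4H and the offset 66H rather than taken from the text, and the address 1A0000H used for the backward jump is purely hypothetical, since the text does not give the location of that instruction.

```python
def sign_extend8(rel8):
    """Interpret an 8-bit relative offset as a signed value (-128..127)."""
    return rel8 - 0x100 if rel8 & 0x80 else rel8


def cjne_target(pa_of_opcode, rel8, length=3):
    """Return the 24-bit target PA of a taken CJNE located at pa_of_opcode."""
    block = pa_of_opcode >> 16
    pc_next = (pa_of_opcode & 0xFFFF) + length   # address after the instruction
    if pc_next > 0xFFFF:                         # the instruction ran into the next block
        block = (block + 1) & 0xFF
        pc_next &= 0xFFFF
    target = pc_next + sign_extend8(rel8)
    if target < 0:                               # jump lands in the previous block
        block = (block - 1) & 0xFF
    elif target > 0xFFFF:                        # jump lands in the next block
        block = (block + 1) & 0xFF
    return (block << 16) | (target & 0xFFFF)


forward = cjne_target(0x19005B, 0x66)    # offset 66H is +102
backward = cjne_target(0x1A0000, 0xF2)   # offset F2H is -14
```

Under these assumptions `forward` is 1900C4H, which agrees with the ADD0 address reported for Fig.7, and `backward` is 19FFF5H, so the block byte drops from 1AH to 19H as in Fig.9.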
3.2 FPGA Verification
The soft core is also verified on an Altera Nios FPGA development board. One test scenario is described below:
External interrupt source zero (IT0) is set to have priority over external interrupt source one (IT1). The valid signal of IT1 appears before that of IT0 while the main program is running, and the valid signal of IT0 arrives while the IT1 sub-module is being processed. Because IT1 has the lower priority, its interrupt sub-module is suspended and the interrupt sub-module of IT0 is executed. When the IT0 sub-module finishes, the PA is changed to the point where the IT1 sub-module was stopped, and that sub-module continues until it ends. After that, the main program resumes. IT0 and IT1 are supplied by two ping-pang keys on the Nios board. The IT1 sub-module makes the LED display the number ‘1’ M times in succession, while the IT0 sub-module makes the LED display the number ‘0’ N times in succession (M>>N). The main program activates a buzzer on the board. The interrupt entry addresses are both located in block_0, and the three modules are placed in different blocks.
After the design passes compilation and simulation in QUARTUS (Altera), the RTL (Register Transfer Level) code of the MCU is programmed into the FPGA and the test machine code is stored in a FLASH memory on the board.
The objective of this experiment is to verify that the program pointer can return to BLOCK_0 and the 24-bit PA can be correctly pushed into or popped out of a stack. The results of the experiment are the same as what we have expected and the design achieving this function is proved to be correct.
4. Conclusions
An MCU with both 16MB data and 16MB program addressing spaces is designed based on the MCS-51 structure. The instruction set is fully compatible with that of the standard 8051, so the tools and third-party software that support the 51 series can all be used. The design is verified to be correct by simulation on a workstation and by verification on an FPGA. As an IP (Intellectual Property) core, the MCU has been embedded into an image processing system chip with the SCMP (Single-Chip Multiprocessor) structure. Its low cost, powerful functions, good compatibility and huge addressing spaces give the MCU a very wide range of application fields.
Acknowledgements
The authors acknowledge contributions from the whole design team, including logic, synthesis, Placement & Route, CAD and test product engineers.
References
[1] Intel Corporation Microcontroller Handbook, MCS-51 Family, Intel Corporation (1984), pp. 6-8.
[2] Myke Predko, Programming And Customizing The 8051 Microcontroller, McGraw-Hill (1999), pp. 7-8.
JING Wei-liang was born in Shanghai, China, in 1980. He received the B.E degree in electrical engineering from Shanghai University, Shanghai, China in 2003.
He is currently studying in Shanghai University for the M.E degree in microelectronics and solid-state electronics. His current research interest is the design of an MCU with high performance and an image processing system with a SCMP structure.
(beastzhener@hotmail.com; 086-13916436733)
HU Yue-li received the B.S degree in applied physics and the M.E degree in electrical engineering from Shanghai University of Technology, Shanghai, China in 1982 and 1989, respectively.
He is an associate professor and currently working in Shanghai University Microelectronic R&D Center. His research interests include digital IC designs, Image Processing and Machine Vision.
CAO Jia-lin is the dean of Shanghai University Microelectronic R&D Center and the vice president of Shanghai University.
As a professor, his research interests include pattern recognition and VLSI circuits.