Copyright (C) 2000  Sony Computer Entertainment Inc.

Basic sample programs for the basic3d VU0 version 
==================================================

Aim of the basic3d VU0 version
--------------------------------------------------------------------------
These sample programs are intended to deepen the understanding of the
mathematical functions for matrix operations and perspective
transformation, and of the mathematical basics of 3D graphics.  In
addition, they show examples of rewriting such code with VU0 macro
instructions as a first step toward faster processing.  The VU0 macro
instructions are coprocessor instructions, so they can be written
directly in assembler. Also, since one instruction can process up to
four data items at a time, they are well suited to matrix and vector
operations.

This basic3d VU0 version is intended to deepen the understanding of how to
use the VU0 macro instructions by showing how the matrix operations etc. 
of the Core version are converted to VU0 macro instructions. We expect 
these programs to help when accelerating a program further with VU0 or 
VU1 micro instructions.

Description of the program
--------------------------------------------------------------------------
<Files>
	main.c		Main program
	vu0.c		Matrix operation functions for coordinate and 
			perspective transformation etc. 
	cube.s		Object data (Cube)
	sjoy.c		Controller utility functions
	sjoy.h		Header file of controller utility functions
	torus.s		Object data (Torus/data divided)
	torus1.s	Object data (Torus/data divided)
	flower.dsm	Texture data
                        
<Starting>
	% make		: Compile
	% make run	: Execute

<Controller operation>
	Up/Down directional button	:Rotation around x-axis (object)
	Right/Left directional button	:Rotation around y-axis (object)
	L1/L2 buttons			:Rotation around z-axis (object)
	
	Triangle/Cross button		:Rotation around the x-axis
					 (viewpoint)
	Square/Circle button		:Rotation around the y-axis
					 (viewpoint)
	R1/R2 buttons			:Translation along the z-axis
					 (viewpoint)

	SELECT button			:Object switching (CUBE/TORUS)

<Specification>
	Display a cube made of textured Triangle Strips 
	Display a torus made of textured Triangles 
	24-bit Z buffer
	3 light sources
	32-bit texture

<Outline of the processing>
	1.Load texture
	2.Open the pad
	3.Set the perspective transformation matrix, the light source 
	  calculation matrix and so on
	4.Execute the perspective transformation and light source 
	  calculation to generate the packet passed to the GS
	5.DMA kick to the GS
	6.Return to 2.

Data flow
--------------------------------------------------------------------------
<Initializing process>
Main RAM                                GS built-in RAM(4M)
  Texture data ------------------------>Expansion of the texture data 
  Expansion of the object data
  Expansion of the perspective transformation matrix etc.

<Execution process>
Main RAM
 Object data--------------------------->Expansion of the packet data for 
					the GS transfer (Primitive data 
     	   Perspective transformation/        (STQ,RGBA and XYZF array))
 	   Lighting calculation                            |
	    (VU0core geometry)                             |    
              		                                   |
                        	                           |
GS built-in RAM (4M) <-------------------------------------+
 Writing to the frame and Z buffer      DMA transfer (DMAtag and GIFtag 
 (Rendering)			        are attached.)


Detailed description of the program
--------------------------------------------------------------------------
<Variable/Structure/Function> (functions in mathfunc.c are excluded.)

	SampleCubeDataHead[]     : The pointer to the object data (cube)
        SampleTorus1DataHead[]   : The pointer to the object data (torus)
        My_texture1[]            : The pointer to the texture data

   	TexEnv          : The structure for generating the packet to 
			  transfer the ps2_gs_texenv structure
        QWdata          : The union to process 128-bit data easily
        GifPacket       : The structure for generating the packet 
			  transferred to the Gif(GS)
        ObjData         : The structure managing the object data expanded 
			  in the memory

        camera_p        : The position of the camera
        camera_zd       : The eye direction vector of the camera
        camera_yd       : The downward direction vector of the camera
        camera_rot      : The vector for rotation of the camera

        light0          : Light 0
        light1          : Light 1
        light2          : Light 2
        color0          : The color of the light 0
        color1          : The color of the light 1
        color2          : The color of the light 2

	ambient		: Ambient light

        obj_trans       : Translation of the object
        obj_rot         : Rotation of the object

        local_world     : The matrix converting the object Local 
			  coordinate into the World coordinate system
        world_view      : The matrix converting the World coordinate 
			  system into the View coordinate system
        view_screen     : The matrix converting the View coordinate system
			  into the Screen coordinate system
        local_screen    : The matrix converting the Local coordinate into 
			  the Screen coordinate (Product of the above 3 
			  matrices)

        normal_light    : The matrix for calculating the inner product 
			  between the vertex normal and each light 
			  direction vector in the World coordinate system
                          (the light direction vectors are set in it).
	local_light     : The normal_light matrix multiplied by the 
			  local_world matrix. It allows the inner product 
			  between the vertex normal and each light 
			  direction vector to be calculated in the Local 
			  coordinate system.
       	light_color     : The matrix that multiplies each inner product, 
			  obtained by multiplying the local_light matrix 
			  by the vertex normal, by the color of the 
			  corresponding light and accumulates the results 
			  (3 light colors plus ambient light).
        local_color     : The matrix generated by multiplying local_light 
			  by light_color.
			  (It is not actually used, because the inner 
			  products must be clamped between 0.0 and 1.0 
			  after the inner product vector is generated.)

        work            : The matrix for temporary work

        ps2_gs_dbuff g_db	: The structure for implementing the 
				  double buffer
	ps2_image g_img		: The structure for transferring the 
				  texture into the GS local memory
	TexEnv texenv		: The structure generating the packet for 
				  transferring the texture data

	LoadObj(ObjData *o, u_int *Head)
                        : The function for registering the object data 
			  expanded in the main RAM to the ObjData 
			  structure and operating them.
       	ReleaseObj(ObjData *o)
                        : The function for releasing the registered object
			  data
       	MakePacket(ObjData *obj, int num)
                        : The function for generating the packet that 
			  executes perspective transformation and light 
			  source calculation and transfers the result to 
			  the GS.

<Variables in the main function>
        fd              : The file descriptor receiving a return value 
			  when the pad is opened
	frame		: The switch for flipping the double buffer
	delta		: The variable for setting the angle of rotation
	obj		: The variable for registering the object


Description of the programs (for initializing)
--------------------------------------------------------------------------
- Read the object data
    LoadObj(&obj[0], SampleCubeDataHead);
    LoadObj(&obj[1], SampleTorus1DataHead);

	Read the object data from SampleCubeDataHead and
	SampleTorus1DataHead.

- Initialize the device
    ps2_gs_open();		// open GS dev
    ps2_gs_vc_graphicsmode();	// set virtual console to graphics mode

	At the head of the program, all devices to be used are 
	initialized.

- Set the drawing environment
    ps2_gs_reset(0, g_inter, g_out_mode, g_ff_mode, g_resolution,
		 g_refresh_rate);	// reset the GS
    ps2_gs_set_dbuff(&g_db, g_psm, gp->width, gp->height,
		     (g_zbits == 0) ? 0 : PS2_GS_ZGREATER,
		     g_zpsm, 1);	// set up the double buffer
    *(__u64 *)&g_db.clear0.rgbaq = PS2_GS_SETREG_RGBAQ(0x10, 0x10, 0x18, 0x80,
				                       0x3f800000);
    *(__u64 *)&g_db.clear1.rgbaq = PS2_GS_SETREG_RGBAQ(0x10, 0x10, 0x18, 0x80,
						       0x3f800000);

	The program sets the drawing environment for the GS.
	
- Image transfer for the texture
    load_teximages(void);
	| ps2_gs_set_image(&g_img, g_textop64, IMAGE_SIZE / 64, PS2_GS_PSMCT32,
	| 		 0, 0, IMAGE_SIZE, IMAGE_SIZE, My_texture1);
	| ps2_gs_load_image(&g_img);

	The function sets the LoadImage information for transferring the 
	texture image into the GS local memory. Then, by executing 
	ps2_gs_load_image(), the specified image is transferred to the
	GS local memory.
	To support the virtual console, the texture has to be reloaded 
	whenever the virtual console is switched.

- Generate the packet for setting the environment for the texture
    texenv.size = ps2_gs_set_texenv(&texenv.gs_tex, 0, &g_img, 8, 8, 0, 0,
				    0, 0, 1);
    PS2_GIFTAG_CLEAR_TAG(&texenv.giftag);
    texenv.giftag.NLOOP = texenv.size;
    texenv.giftag.EOP = 1;
    texenv.giftag.PRE = 0;
    texenv.giftag.NREG = 1;
    texenv.giftag.REGS0 = PS2_GIFTAG_REGS_AD;
    *(__u64 *)&texenv.gs_tex.clamp1 = PS2_GS_SETREG_CLAMP(0, 0, 0, 0, 0, 0);

	The function generates the packet for setting the GS general 
	purpose registers TEX0_1 and CLAMP_1 related to the texture using 
	the ps2_gs_set_texenv().

- Transfer the packet for setting the environment for the texture
    ps2_dma_start_n(g_fd_gs, &texenv.giftag, texenv.size + 1);

	The program transfers the generated packet for setting the texture
	environment by executing DMA.

- Initialize the controller
    sjoy_open();

	Initialize the controllers (game pads).  (You can treat them as
	normal joysticks.)  The controller data is acquired using
	sjoy_get_ps2_button().  The state of the controllers is updated
	by calling sjoy_poll().


Description of the programs (for the main loop)
--------------------------------------------------------------------------
The following sections describe the processing executed in the main loop:
acquiring the controller information, generating the matrices and 
generating the packet (coordinate and perspective transformation, UV->STQ 
transformation and light source calculation).

- Generate the View-Screen matrix
    ps2_vu0_view_screen_matrix(view_screen, 512.0f, 1.0f, gp->pixel_ratio,
			       gp->center_x, gp->center_y,
			       1.0f, (g_zbits == 0) ? 2 : ((1 << g_zbits) - 1),
			       1.0f, 65536.0f);

	The function calculates the View-Screen matrix.  In this
	example, the distance from the viewpoint to the projection
	screen is 512, the aspect ratio of the screen is
	1:gp->pixel_ratio, the central point of the screen is
	(gp->center_x, gp->center_y), the Z buffer values range from
	1.0 to the maximum value (1 << g_zbits) - 1 (2 when there is no
	Z buffer), and the range of z is 1.0-65536.0.
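
	The effect of these parameters can be sketched in plain C as
	below. This is not the source of ps2_vu0_view_screen_matrix();
	the structure screen_params, its field names and the function
	project_to_screen() are hypothetical, and the mapping of the
	near z to the maximum Z buffer value is an assumption chosen to
	agree with the PS2_GS_ZGREATER test set in ps2_gs_set_dbuff().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parameter bundle; the names follow the description
 * above, not the library headers. */
typedef struct {
    float scrz;          /* distance from the viewpoint to the screen */
    float ax, ay;        /* aspect ratio (x : y)                      */
    float cx, cy;        /* central point of the screen               */
    float zmin, zmax;    /* range of the values stored in Z buffer    */
    float nearz, farz;   /* range of z in the View coordinate system  */
} screen_params;

/* Project a point v = (x, y, z) of the View coordinate system. */
void project_to_screen(float out[3], const screen_params *s,
                       const float v[3])
{
    float range, cz, az;

    assert(out != NULL && s != NULL && v != NULL);
    assert(v[2] != 0.0f && s->farz != s->nearz);

    range = s->farz - s->nearz;
    /* Assumption: z = nearz gives zmax and z = farz gives zmin. */
    cz = (s->zmin * s->farz - s->zmax * s->nearz) / range;
    az = s->farz * s->nearz * (s->zmax - s->zmin) / range;

    out[0] = s->cx + s->scrz * s->ax * v[0] / v[2];
    out[1] = s->cy + s->scrz * s->ay * v[1] / v[2];
    out[2] = cz + az / v[2];
}
```

	A point twice as far from the viewpoint lands half as far from
	the central point of the screen, which is the perspective effect
	that the View-Screen matrix encodes.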

- Lock virtual console
    ps2_gs_vc_lock();

	Lock the virtual console and prohibit switching to another
	graphics program.  No context on the GS/VPU0/VPU1 is saved or
	restored, so take care when you permit virtual console
	switching by calling ps2_gs_vc_unlock().

- Switch the drawing environment
    ps2_gs_set_half_offset((frame & 1) ? &g_db.draw1 : &g_db.draw0, odev);
    ps2_gs_swap_dbuff(&g_db, frame);

	Prepare the drawing environment and swap the draw/display
	buffers. For NTSC interlaced mode, the drawing environment is
	shifted by half a pixel.

- Acquire the controller information
    // --- read pad ---
    sjoy_poll();
    paddata = sjoy_get_ps2_button(0);
    
    // --- object rotate & change view point ---
    if (paddata & SJOY_PS2_L_DOWN) {
	obj_rot[0] += delta;
	    :	:
	    :	:

	The controller button information is acquired using
	sjoy_get_ps2_button(). In this example, the rotation angles of
	the object and the camera and the amount of translation are
	updated based on the acquired data. The buttons on the left-hand
	side of the controller operate the object, while the buttons on
	the right-hand side operate the camera. The SELECT button
	switches the object.

- Generate the Local-World matrix (rotation only)
    ps2_vu0_unit_matrix(work);                   //Unit matrix 
    ps2_vu0_rot_matrix(local_world, work, rot);  //Rotating matrix 

	The Local-World matrix must be calculated for every object held 
	in the Local coordinate system. An affine transformation, such as
	rotation and translation, is used. In this sample program, a 
	rotation matrix and a translation matrix are calculated and their 
	product is used as the Local-World matrix.
	In this sample program, however, the Local_Light matrix must be 
	generated before the translation (obj_trans) is applied. If the 
	Local_Light matrix is generated from a matrix that includes the 
	translation, the ambient light is not calculated correctly. This 
	is because the translation and the ambient light are both 
	calculated using the fourth column of the matrix. Therefore, 
	generate the Local_Light matrix immediately after setting the 
	rotation matrix.
	Note that the fourth element of obj_trans must be set to 0.
	Otherwise, the critical value held in the [4,4] element of the 
	matrix will change.

- Generate the Local_Light matrix
    ps2_vu0_normal_light_matrix( normal_light, light0,light1,light2);
    ps2_vu0_mul_matrix(local_light, normal_light,local_world);

	The function calculates the Normal-Light matrix from the three 
	light sources using the ps2_vu0_normal_light_matrix(). The matrix 
	is then multiplied by the Local-World matrix to generate the 
	Local-Light matrix. As mentioned above, the Local-World matrix 
	used here must not yet include obj_trans.

- Generate the Light_Color matrix
    ps2_vu0_light_color_matrix( light_color, color0, color1, color2, ambient);

	The function calculates the Light_Color matrix from the colors of 
	the light sources and the ambient color using the 
	ps2_vu0_light_color_matrix(). In the light source calculation, the 
	final vertex color on the screen is determined by multiplying by 
	the Light_Color matrix after each lighting effect is calculated.

- Generate the Local-World matrix (rotation and translation)
    ps2_vu0_trans_matrix(local_world, local_world, obj_trans);

	The function adds the translation obj_trans to the Local-World 
	matrix, which has held only the rotation up to this point, using 
	the ps2_vu0_trans_matrix(). The translation is applied here, after 
	the Local_Light matrix has been generated, for the reason 
	described above.

- Generate the World-View matrix
    ps2_vu0_rot_camera_matrix(world_view, camera_p, camera_zd,camera_yd,
			      camera_rot);

	The function calculates the World-View matrix. The View 
	coordinate system defines the viewpoint (eye) as the origin point 
	(0,0,0), the eye direction as Z+, the rightward direction as X+ 
	and the downward direction as Y+ (right-handed coordinate system).
	In this sample, the viewpoint camera_p is (0,0,-25), the eye 
	direction camera_zd is (0,0,1) and the vertical direction 
	camera_yd is (0,1,0) in the World coordinate system, and the 
	World-View matrix is generated from these vectors. The 
	ps2_vu0_rot_camera_matrix() function first rotates these vectors 
	by camera_rot and then passes the result to the 
	ps2_vu0_camera_matrix(), which it calls internally. This is how 
	the rotation of the camera is enabled.
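
	How a World-View matrix can be built from these three vectors 
	is sketched below in plain C. This is a standard construction 
	and not the source of ps2_vu0_camera_matrix(); the function 
	names cross3(), dot3() and camera_matrix() are hypothetical. 
	It assumes that zd and yd are unit vectors perpendicular to 
	each other and that m[i] holds the i-th column of the matrix.

```c
#include <assert.h>
#include <stddef.h>

void cross3(float out[3], const float a[3], const float b[3])
{
    out[0] = a[1] * b[2] - a[2] * b[1];
    out[1] = a[2] * b[0] - a[0] * b[2];
    out[2] = a[0] * b[1] - a[1] * b[0];
}

float dot3(const float a[3], const float b[3])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

/* Build the World-View matrix from the viewpoint p, the eye
 * direction zd and the vertical direction yd. */
void camera_matrix(float m[4][4], const float p[3],
                   const float zd[3], const float yd[3])
{
    float xd[3], y2[3];
    size_t i;

    assert(m != NULL && p != NULL && zd != NULL && yd != NULL);

    cross3(xd, yd, zd);     /* X+ = Y+ x Z+ (right-handed)      */
    cross3(y2, zd, xd);     /* Y+ perpendicular to Z+ and X+    */

    for (i = 0; i < 3; i++) {
        m[i][0] = xd[i];    /* rows of the rotation part are    */
        m[i][1] = y2[i];    /* the axes of the View coordinate  */
        m[i][2] = zd[i];    /* system                           */
        m[i][3] = 0.0f;
    }
    /* Move the viewpoint to the origin point. */
    m[3][0] = -dot3(xd, p);
    m[3][1] = -dot3(y2, p);
    m[3][2] = -dot3(zd, p);
    m[3][3] = 1.0f;
}
```

	With the values of this sample, the origin point of the World 
	coordinate system is placed 25 in front of the viewpoint 
	(z = 25 in the View coordinate system).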

- Generate the Local-Screen matrix
    ps2_vu0_mul_matrix(work, world_view, local_world);	    //Local-View
    ps2_vu0_mul_matrix(local_screen, view_screen, work);       //Local-Screen

	First, calculate the Local-View matrix from the World-View 
	matrix and the Local-World matrix already calculated. The 
	Local-Screen matrix is then calculated by multiplying the 
	View-Screen matrix by the Local-View matrix.
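
	The order of the multiplications matters. The plain C sketch 
	below shows this with a hypothetical function mul_matrix(); 
	it is not the source of ps2_vu0_mul_matrix(). It assumes that 
	m[i] holds the i-th column of a matrix and that out = a * b 
	means "apply b first, then a".

```c
#include <assert.h>
#include <stddef.h>

/* out = a * b. out may be the same array as a or b. */
void mul_matrix(float out[4][4], float a[4][4], float b[4][4])
{
    float tmp[4][4];
    size_t i, j, k;

    assert(out != NULL && a != NULL && b != NULL);

    for (j = 0; j < 4; j++) {           /* column of the result */
        for (k = 0; k < 4; k++) {       /* row of the result    */
            float sum = 0.0f;
            for (i = 0; i < 4; i++)
                sum += a[i][k] * b[j][i];
            tmp[j][k] = sum;
        }
    }
    for (j = 0; j < 4; j++)
        for (k = 0; k < 4; k++)
            out[j][k] = tmp[j][k];
}
```

	For example, scaling after translation also scales the 
	translation, while translation after scaling does not. In the 
	same way, local_world must be the rightmost matrix so that it 
	is applied to the vertex first.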

- Generate the packet
    MakePacket(&obj[obj_switch], i);

	The packet is generated in the MakePacket() internally. The 
	detail will be described later.

- Packet transfer to the GS
    ps2_dma_start(g_fd_gs, vfd, (ps2_dmatag *)obj[obj_switch].pack[i].buf);

	Transfer the packet of the primitive generated by the MakePacket()
	by executing DMA.

- Unlock virtual console
    ps2_gs_vc_unlock();

	Unlock virtual console to permit other graphics
	applications to be switched.


Description of the programs (for generating the packet)
--------------------------------------------------------------------------
The contents of the packet generation processing in the MakePacket() will
be described in the following sections.

- Initialize the packet
    pack->size = 0;
    pack->buf = (QWdata *)s_spr;

	The packet is written starting at the address s_spr. The
	value that can be set in the QWC for a DMA transfer is
	limited, so note that the size of the packet must not exceed
	1MB. The torus data in this sample program is split into blocks
	of about 4 KB so as not to exceed the capacity of the double
	buffer of the VUMem1 used in the VU1 version.

- Generate the DMAtag and GIFtag
    // add DMAtag
    pack->buf[pack->size].ul128 = 0;
    dmatag = (ps2_dmatag *)&pack->buf[pack->size].ul128;
    dmatag->ID = PS2_DMATAG_END;
    dmatag->QWC = obj->vertexNum[num] * 3 + 1;
    pack->size++;
    
    // add vertex info(GIFtag, STQ & RGBA & XYZ)
    giftag = (ps2_giftag *)&pack->buf[pack->size++];
    giftag->NLOOP = obj->vertexNum[num];
    giftag->EOP = 1;
    giftag->PRE = 1;
    giftag->PRIM = obj->prim;
    giftag->FLG = PS2_GIFTAG_FLG_PACKED;
    giftag->NREG = 3;
    giftag->REGS0 = PS2_GIFTAG_REGS_ST;
    giftag->REGS1 = PS2_GIFTAG_REGS_RGBAQ;
    giftag->REGS2 = PS2_GIFTAG_REGS_XYZ2;

	The DMAtag and GIFtag must be attached at the head of the
	packet. The size of the data that will be sent equals the
	number of vertices x 3 (STQ, RGBA and XYZ2) plus 1 (GIFtag),
	and the ID of the DMAtag is END (PS2_DMATAG_END), because
	there is only one DMA packet.  The GIF uses the PACKED mode
	(PS2_GIFTAG_FLG_PACKED) and the primitive is set in the
	GIFtag (obj->prim).  REGS0 to REGS2 specify the registers to
	be set.
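
	The size calculation can be sketched in plain C as below. The 
	function names are hypothetical. The limit 0xFFFF assumes that 
	the QWC is a 16-bit field, which agrees with the 1MB limit 
	mentioned above (65535 quadwords of 16 bytes each).

```c
#include <assert.h>

/* Quadwords that follow the DMAtag:
 * 1 GIFtag + 3 quadwords (STQ, RGBA, XYZ2) per vertex. */
unsigned packet_qwc(unsigned vertex_num)
{
    assert(vertex_num > 0u);
    return vertex_num * 3u + 1u;
}

/* Quadwords written to the buffer, including the DMAtag itself. */
unsigned packet_total_qwords(unsigned vertex_num)
{
    return packet_qwc(vertex_num) + 1u;
}

/* Nonzero when the packet fits in one DMA transfer.
 * Assumption: the QWC is a 16-bit field. */
int packet_fits(unsigned vertex_num)
{
    return packet_qwc(vertex_num) <= 0xFFFFu;
}
```

	For a block of 4 vertices, the QWC is 13 and 14 quadwords are 
	written to the buffer in total.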

- Generate STQ, RGBA and XYZ
    vu0_rot_trans_pers_n_clip_col(&pack->buf[pack->size].ul128, local_screen,
				  vertex,normal, texUV,color, local_light,
				  light_color, obj->vertexNum[num]);

	The function calculates the contents of the packet (STQ, RGBA and
	XYZ) using the vu0_rot_trans_pers_n_clip_col().

Description of the programs (for VU0 macro instructions)
--------------------------------------------------------------------------
The processing of the vu0_rot_trans_pers_n_clip_col() is described in 
the following section. This is the main part of these sample programs,
since the function executes the coordinate and perspective transformation
and the light source calculation using the VU0 macro instructions.
See vu0.c.

- Set the matrixes
    Set the Local-Screen matrix
    lqc2	vf4,0x0(%1)	#set local_screen matrix[0]
    lqc2	vf5,0x10(%1)	#set local_screen matrix[1]
    lqc2	vf6,0x20(%1)	#set local_screen matrix[2]
    lqc2	vf7,0x30(%1)	#set local_screen matrix[3]
    Set the Local-Light matrix
    lqc2	$vf17,0x0(%6)	#set local_light matrix[0]
    lqc2	$vf18,0x10(%6)	#set local_light matrix[1]
    lqc2	$vf19,0x20(%6)	#set local_light matrix[2]
    Set the Light-Color matrix
    lqc2	$vf21,0x0(%7)	#set light_color matrix[0]
    lqc2	$vf22,0x10(%7)	#set light_color matrix[1]	
    lqc2	$vf23,0x20(%7)	#set light_color matrix[2]
    lqc2	$vf20,0x30(%7)	#set light_color matrix[3]		

	First, set the Local-Screen matrix, Local-Light matrix and 
	Light-Color matrix in the VF registers of the VU0.

- Read the vertex, normal line, color of vertex and ST
    lqc2	vf8,0x0(%2)	#load XYZ
    lqc2	$vf24,0x0(%4)	#load NORMAL
    lqc2	$vf25,0x0(%5)	#load COLOR
    lqc2	$vf27,0x0(%8)	#load ST

	Read the coordinate of the vertex, normal line, color of the 
	vertex and the coordinate of the texture necessary to generate
	the packet into the register.

- Coordinate transformation
    # (X0,Y0,Z0,W0)=[SCREEN/LOCAL]*(X,Y,Z,1)
    vmulax.xyzw     ACC, vf4,vf8
    vmadday.xyzw    ACC, vf5,vf8
    vmaddaz.xyzw    ACC, vf6,vf8
    vmaddw.xyzw     vf12,vf7,vf8

	Calculate the screen coordinate by multiplying the Local-Screen
	matrix by the vertex coordinate. In this case, the value W is the 
	same as the value Z in the View coordinate system.
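
	What the four instructions calculate can be modeled in plain C 
	as below. The function apply_matrix() is hypothetical; m[i] 
	stands for the i-th quadword of the matrix (the contents of 
	vf4 to vf7) and acc stands for the ACC register.

```c
#include <assert.h>
#include <stddef.h>

/* out = [m] * v, calculated in the same four steps as the
 * VU0 macro instructions. */
void apply_matrix(float out[4], float m[4][4], const float v[4])
{
    float acc[4];
    size_t k;

    assert(out != NULL && m != NULL && v != NULL);

    for (k = 0; k < 4; k++)             /* vmulax.xyzw  ACC,vf4,vf8  */
        acc[k] = m[0][k] * v[0];
    for (k = 0; k < 4; k++)             /* vmadday.xyzw ACC,vf5,vf8  */
        acc[k] += m[1][k] * v[1];
    for (k = 0; k < 4; k++)             /* vmaddaz.xyzw ACC,vf6,vf8  */
        acc[k] += m[2][k] * v[2];
    for (k = 0; k < 4; k++)             /* vmaddw.xyzw  vf12,vf7,vf8 */
        out[k] = acc[k] + m[3][k] * v[3];
}
```

	Each of the inner loops corresponds to one instruction, which 
	processes the four elements x, y, z and w at a time.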

- Perspective transformation
    # (X1,Y1,Z1,1)=(X0/W0,Y0/W0,Z0/W0,W0/W0)
    vdiv    Q,vf0w,vf12w
    vwaitq
    vmulq.xyzw	vf12,vf12,Q
    vftoi4.xyzw	vf13,vf12

	Multiply the screen coordinate by 1/w. Then convert it into the 
	fixed point value for the GIF packet. 1/w used here is kept for
	multiplying by the texture coordinate later.
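
	This step can be modeled in plain C as below. The function 
	names are hypothetical. The conversion assumes that vftoi4 
	produces a fixed point value with 4 fractional bits and 
	truncates toward zero, as a C cast does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of vftoi4: float -> fixed point with 4 fractional bits. */
int32_t ftoi4(float f)
{
    return (int32_t)(f * 16.0f);
}

/* Divide (X0,Y0,Z0,W0) by W0 and convert the result to fixed
 * point. Returns 1/W0 so that it can be reused for the texture
 * coordinate. */
float perspective_divide(int32_t out[4], const float in[4])
{
    float q;
    size_t k;

    assert(out != NULL && in != NULL && in[3] != 0.0f);

    q = 1.0f / in[3];                   /* vdiv Q,vf0w,vf12w */
    for (k = 0; k < 4; k++)
        out[k] = ftoi4(in[k] * q);      /* vmulq and vftoi4  */
    return q;
}
```

	For example, (100, 50, 8, 2) becomes (50, 25, 4, 1) after the 
	division and (800, 400, 64, 16) in fixed point.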

- Calculate the lighting effect
    # (L1,L2,L3)=[LLM](Nx,Ny,Nz)
    # LLM: Local light matrix
    # L1,L2,L3: Lighting effect
    # Nx,Ny,Nz: Normal line vector
    vmulax.xyzw    ACC, $vf17,$vf24
    vmadday.xyzw   ACC, $vf18,$vf24
    vmaddz.xyzw    $vf24,$vf19,$vf24
    vmaxx.xyz      $vf24,$vf24,$vf0 # Set a negative value to 0

	Calculate the lighting effect by multiplying the local light 
	matrix by the normal line vector. If the light direction vector 
	and the normal line vector point the same way, the surface faces 
	away from the light and the result takes a negative value. The 
	lighting effect is, therefore, clamped to 0.
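
	The calculation can be modeled in plain C as below. The 
	function lighting_effect() is hypothetical; llm[i] stands for 
	the i-th quadword of the local light matrix ($vf17 to $vf19), 
	so element k of the result is the inner product for light k.

```c
#include <assert.h>
#include <stddef.h>

void lighting_effect(float out[4], float llm[3][4], const float n[3])
{
    size_t k;

    assert(out != NULL && llm != NULL && n != NULL);

    for (k = 0; k < 4; k++) {
        float acc = llm[0][k] * n[0];   /* vmulax.xyzw  */
        acc += llm[1][k] * n[1];        /* vmadday.xyzw */
        acc += llm[2][k] * n[2];        /* vmaddz.xyzw  */
        out[k] = acc;
    }
    for (k = 0; k < 3; k++)             /* vmaxx.xyz: x, y, z only */
        if (out[k] < 0.0f)
            out[k] = 0.0f;              /* set a negative value to 0 */
}
```

	A light whose inner product is negative contributes nothing, 
	instead of darkening the vertex.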

- Calculate the lighting effect color
    # (LTr,LTg,LTb,LTw)=[LCM](L1,L2,L3,1)
    # LCM: Light color matrix
    # LTr,LTg,LTb: Lighting effect color
    vmulax.xyzw    ACC, $vf21,$vf24
    vmadday.xyzw   ACC, $vf22,$vf24
    vmaddaz.xyzw   ACC, $vf23,$vf24
    vmaddw.xyzw    $vf24,$vf20,$vf0	

	Calculate the lighting effect color by multiplying the light color
	matrix by the lighting effect.
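
	In plain C, this multiplication can be modeled as below. The 
	function lighting_color() is hypothetical; lcm[0] to lcm[2] 
	stand for the colors of the three lights ($vf21 to $vf23) and 
	lcm[3] for the ambient light ($vf20), which is always 
	multiplied by 1.

```c
#include <assert.h>
#include <stddef.h>

void lighting_color(float out[4], float lcm[4][4], const float l[3])
{
    size_t k;

    assert(out != NULL && lcm != NULL && l != NULL);

    for (k = 0; k < 4; k++) {
        float acc = lcm[0][k] * l[0];   /* vmulax.xyzw  */
        acc += lcm[1][k] * l[1];        /* vmadday.xyzw */
        acc += lcm[2][k] * l[2];        /* vmaddaz.xyzw */
        out[k] = acc + lcm[3][k];       /* vmaddw.xyzw with vf0 (w = 1) */
    }
}
```

	Each light color is weighted by its lighting effect and the 
	ambient light is added regardless of the normal line vector.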

- Calculate the color of the vertex on the screen
    # (RR,GG,BB) = (R,G,B)*(LTr,LTg,LTb)
    # R,G,B: Color of the vertex
    # RR,GG,BB: Color of the vertex on the screen
    vmul.xyzw	$vf26,$vf24,$vf25
    # [0..255] Saturation
    vmaxx.xyz	$vf26,$vf26,$vf0
    lui		$2,0x437f
    ctc2	$2,$vi21
    vnop
    vnop
    vminii.xyz	$vf26,$vf26,I
    vftoi0.xyzw	$vf26,$vf26

	Calculate the color of the vertex on the screen by multiplying the
	lighting effect color by the color of the vertex. The resulting
	value is clamped because its range is restricted to 0 through 
	255. It is then converted into an integer value for the GIF 
	packet.
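
	The clamping can be modeled in plain C as below. The function 
	names are hypothetical. The sketch assumes a 32-bit IEEE 754 
	float, in which the bit pattern 0x437f0000 loaded by the lui 
	instruction is 255.0, and assumes that vftoi0 truncates 
	toward zero.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float. */
float float_from_bits(uint32_t bits)
{
    float f;

    assert(sizeof f == sizeof bits);
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* One color element: multiply, clamp to [0, 255], convert. */
int32_t screen_color(float vertex_color, float lighting_color)
{
    float c = vertex_color * lighting_color;          /* vmul    */
    float max = float_from_bits(0x437f0000u);         /* lui, ctc2 */

    if (c < 0.0f)                                     /* vmaxx   */
        c = 0.0f;
    if (c > max)                                      /* vminii  */
        c = max;
    return (int32_t)c;                                /* vftoi0  */
}
```

	For example, a vertex color of 200 lit with a lighting effect 
	color of 2.0 is clamped to 255 instead of overflowing.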

- Calculate ST
    # (S,T,Q) = (s,t,1)/w
    vmulq.xyz	$vf28,$vf27,Q

	Multiply the texture coordinate by the value of 1/w kept from 
	the perspective transformation, for perspective correction.
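
	As a plain C model, with the hypothetical function make_stq() 
	and q standing for the value of 1/w held in the Q register:

```c
#include <assert.h>
#include <stddef.h>

/* (S,T,Q) = (s,t,1)/w, where q = 1/w. */
void make_stq(float out[3], float s, float t, float q)
{
    assert(out != NULL);
    out[0] = s * q;
    out[1] = t * q;
    out[2] = 1.0f * q;
}
```

	The GS divides S and T by Q again for each pixel, which gives 
	the perspective-correct texture coordinate.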

- Save the value of STQ, RGBA and XYZ
    sqc2	$vf28,0x0(%0)		#store STQ 
    addi	%0,0x10
    sqc2	$vf26,0x0(%0)		#store RGBA 
    addi	%0,0x10
    sqc2	vf13,0x0(%0)		#store XYZ 
    addi	%0,0x10
    #
    addi	%3,-1
    addi	%2,0x10
    addi	%4,0x10
    addi	%5,0x10
    addi	%8,0x10
    bne		$0,%3,_rotTPNCC_loop

	Save the values of STQ, RGBA and XYZ into the packet in the 
	order set in the GIFtag. Note that the order is important when 
	transferring through the GIF, since the value of Q used by 
	RGBAQ is the value saved in an internal register by the ST 
	written immediately before it.
	This completes the main loop of the 
	vu0_rot_trans_pers_n_clip_col(). The processing from the reading 
	of the vertex data onward is repeated, with each address 
	incremented, until the specified number of vertices is reached.
	
