I have wanted to write some code that uses a Windows Communication
Foundation (WCF) service for quite some time. This post introduces a
“remote desktop” service. The purpose is to create something that may
be useful later while keeping it simple enough to be a learning
platform. In no way do I envision this being a viable competitor to
Microsoft Remote Desktop, VNC, or the many other mature technologies.
The goal of the project is to use a client (a WinForms application in
this case) to remotely view the desktop of a computer hosting the WCF
service, and maybe later control it.
Capturing Desktop Activity
Capturing the activity on the desktop is fairly straightforward. However, if you want good performance you must use some tricks. A key performance metric for remote desktop applications is the refresh rate of the remote screen. Two areas that I will concentrate on to increase refresh rates are:
- The algorithm used to generate the screen capture must be fast.
- The amount of data shuttled across the network must be minimized.
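The ScreenCapture class exposes two public methods: Cursor and Screen. I want to capture both the desktop surface and the mouse cursor as it moves over the desktop, and I have separated the methods that capture this data. It seems reasonable that I would get a better user experience by refreshing the cursor faster and the screen a bit slower, so this class allows me to run different refresh rates for the two.
Here is the code for capturing the cursor and the screen: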
```csharp
public Bitmap Screen(ref Rectangle bounds)
{
    // Capture a new screenshot.
    _newBitmap = CaptureScreen.CaptureDesktop();

    // If we have a previous screenshot, only send back
    // a subset that is the minimum rectangular area
    // that encompasses all the changed pixels.
    if (_prevBitmap != null)
    {
        // Get the bounding box.
        bounds = GetBoundingBoxForChanges();
        if (bounds == Rectangle.Empty)
        {
            // Nothing has changed.
            return null;
        }

        // Copy the minimum rectangular area into its own bitmap.
        Bitmap diff = new Bitmap(bounds.Width, bounds.Height);
        using (Graphics g = Graphics.FromImage(diff))
        {
            g.DrawImage(_newBitmap, 0, 0, bounds, GraphicsUnit.Pixel);
        }

        // Set the current bitmap as the previous to prepare
        // for the next screen capture.
        _prevBitmap = _newBitmap;

        return diff;
    }
    else
    {
        // We don't have a previous screen capture. Therefore
        // we need to send back the whole screen this time.
        _prevBitmap = _newBitmap;

        // Create a bounding rectangle.
        bounds = new Rectangle(0, 0, _newBitmap.Width, _newBitmap.Height);

        return _newBitmap;
    }
}

public Bitmap Cursor(ref int cursorX, ref int cursorY)
{
    // The cursor position is checked against the last screen
    // capture, so there is nothing to report until one exists.
    if (_newBitmap == null)
    {
        return null;
    }

    Bitmap img = CaptureScreen.CaptureCursor(ref cursorX, ref cursorY);
    if (img != null && cursorX < _newBitmap.Width && cursorY < _newBitmap.Height)
    {
        return img;
    }

    return null;
}
```
First I must tell you that I am using Rashid Mahmood’s code to capture the actual bitmaps of the screen and cursor. Rashid’s code can be found here. He provides wrappers around a number of Win32 API calls. For performance, I want these calls to be as close to the operating system as possible, with as few wrappers as I can manage. All the static methods in the CaptureScreen class are provided by Rashid. Please check out the link for more information.
To minimize the amount of network traffic, I only want to return data when pixels have changed. To accomplish this, the Screen method keeps the previously captured screenshot for comparison. It then uses a “GetBoundingBoxForChanges” method to determine the minimal rectangle that encompasses all the changed pixels. This rectangle is then used to generate a smaller bitmap, a portion of the full screen, for transfer over the wire. I searched for a Win32 API that returns this bounding box, but did not find one.
The “GetBoundingBoxForChanges” routine is shown below. It is fairly well commented. The algorithm uses a few tricks to increase performance.
```csharp
private Rectangle GetBoundingBoxForChanges()
{
    // The search runs in two passes. The first pass starts in the
    // upper-left corner and scans left to right, top to bottom, to
    // find the top and left bounds. The second pass starts in the
    // lower-right corner and scans right to left, bottom to top, to
    // find the bottom and right bounds. Both passes adapt the number
    // of pixels scanned per row to the bounds found so far.
    //
    // Note: The GetPixel member of the Bitmap class is too slow for
    // this purpose, so unsafe code reads the pixels through pointers.

    // Validate the images are the same shape and type.
    if (_prevBitmap.Width != _newBitmap.Width ||
        _prevBitmap.Height != _newBitmap.Height ||
        _prevBitmap.PixelFormat != _newBitmap.PixelFormat)
    {
        // Not the same shape...can't do the search.
        return Rectangle.Empty;
    }

    // Init the search parameters.
    int width = _newBitmap.Width;
    int height = _newBitmap.Height;
    int left = width;
    int right = 0;
    int top = height;
    int bottom = 0;

    BitmapData bmNewData = null;
    BitmapData bmPrevData = null;
    try
    {
        // Lock the bits into memory as 32-bit ARGB so that every
        // pixel is exactly one int, whatever the source format.
        Rectangle full = new Rectangle(0, 0, width, height);
        bmNewData = _newBitmap.LockBits(full, ImageLockMode.ReadOnly, PixelFormat.Format32bppArgb);
        bmPrevData = _prevBitmap.LockBits(full, ImageLockMode.ReadOnly, PixelFormat.Format32bppArgb);

        // Get the number of integers (4 bytes) in each row of the image.
        const int numBytesPerPixel = 4;
        int strideNew = bmNewData.Stride / numBytesPerPixel;
        int stridePrev = bmPrevData.Stride / numBytesPerPixel;

        unsafe
        {
            // Only change detection matters, not the individual ARGB
            // channels, so each pixel is read and compared as one int.
            int* pNew = (int*)bmNewData.Scan0.ToPointer();
            int* pPrev = (int*)bmPrevData.Scan0.ToPointer();

            // First pass: find the left and top bounds. Each row is
            // scanned only up to the current left bound. The right and
            // bottom bounds are seeded here for the second pass.
            for (int y = 0; y < height; ++y)
            {
                for (int x = 0; x < left; ++x)
                {
                    if (pNew[x] != pPrev[x])
                    {
                        // Found a change. Because x < left, this is the
                        // new left bound, which also ends the row scan.
                        left = x;
                        if (x > right) { right = x; }
                        if (y < top) { top = y; }
                        if (y > bottom) { bottom = y; }
                    }
                }

                // Move the pointers to the next row.
                pNew += strideNew;
                pPrev += stridePrev;
            }

            // If the first pass found no changed pixels
            // there is no need for a second pass.
            if (left != width)
            {
                // Second pass: find the right and bottom bounds,
                // working up from the bottom row to the top bound.
                pNew = (int*)bmNewData.Scan0.ToPointer() + (height - 1) * strideNew;
                pPrev = (int*)bmPrevData.Scan0.ToPointer() + (height - 1) * stridePrev;

                for (int y = height - 1; y >= top; y--)
                {
                    // Below the current bottom bound any change moves
                    // that bound, so the whole row is scanned. Elsewhere
                    // only pixels right of the right bound can matter.
                    int stopX = (y > bottom) ? -1 : right;
                    for (int x = width - 1; x > stopX; x--)
                    {
                        if (pNew[x] != pPrev[x])
                        {
                            // This is the rightmost change in the row.
                            if (x > right) { right = x; }
                            if (y > bottom) { bottom = y; }
                            break;
                        }
                    }

                    // Move up one row.
                    pNew -= strideNew;
                    pPrev -= stridePrev;
                }
            }
        }
    }
    finally
    {
        // Unlock the bits of the images.
        if (bmNewData != null) { _newBitmap.UnlockBits(bmNewData); }
        if (bmPrevData != null) { _prevBitmap.UnlockBits(bmPrevData); }
    }

    // If no pixel changed, return an empty rectangle.
    if (left == width)
    {
        return Rectangle.Empty;
    }

    // Return the bounding box.
    return new Rectangle(left, top, right - left + 1, bottom - top + 1);
}
```
The first trick is to avoid the “GetPixel” method of the Bitmap class, which is too slow for our purposes. Instead we access the bitmap data using pointers, which requires a bit of “unsafe” code. Each pixel has 4 bytes that provide the alpha, red, green, and blue channel values (ARGB). The algorithm reads all 4 bytes at once by using int pointers.
The algorithm uses a two-pass approach to determine the bounding box. The first pass searches from top to bottom, scanning each row from left to right for changed pixels. The number of pixels scanned per row is adapted as we go to reduce the search time. After the first pass we know the top and left boundaries and have initial values for the bottom and right boundaries. The second pass searches from bottom to top, scanning each row from right to left for changed pixels. Again we adapt the scan width as we go. After this pass we know the bottom and right boundaries.
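As a cross-check for the optimized routine, here is a minimal sketch of the same idea in safe code. It is not the code used by the service: it assumes each frame is already available as an int array with one packed ARGB value per pixel in row-major order, and the names BoundingBoxSketch and FindChanges are hypothetical, chosen only for illustration. It visits every pixel, so it is slower, but it is easy to verify and can be used to test a faster version.

```csharp
using System;

public static class BoundingBoxSketch
{
    // Returns null when the frames are identical; otherwise the
    // inclusive bounds { left, top, right, bottom } of all changed pixels.
    public static int[] FindChanges(int[] prev, int[] next, int width, int height)
    {
        if (prev.Length != width * height || next.Length != width * height)
        {
            throw new ArgumentException("Frame size does not match width * height.");
        }

        int left = width, top = height, right = -1, bottom = -1;

        for (int y = 0; y < height; y++)
        {
            int row = y * width;
            for (int x = 0; x < width; x++)
            {
                // One int comparison covers all four channels.
                if (prev[row + x] != next[row + x])
                {
                    if (x < left) { left = x; }
                    if (x > right) { right = x; }
                    if (y < top) { top = y; }
                    if (y > bottom) { bottom = y; }
                }
            }
        }

        return right < 0 ? null : new[] { left, top, right, bottom };
    }
}
```

For example, with a frame 4 pixels wide and 3 pixels high where only the pixels at (1, 0) and (2, 2) differ, FindChanges returns { 1, 0, 2, 2 }, which corresponds to a rectangle at (1, 0) that is 2 pixels wide and 3 pixels high.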
I am sure there are better ways to achieve a faster algorithm and minimize the amount of data. If you have suggestions, please leave a comment.
WCF Server & Host
To create the WCF server, I first created the service contract. The following is a simple contract that provides updates of desktop activity. Again, I separated the screen and cursor updates with the intent of having different refresh rates.
```csharp
[ServiceContract(SessionMode = SessionMode.Required)]
public interface IRemoteDesktop
{
    [OperationContract]
    byte[] UpdateScreenImage();

    [OperationContract]
    byte[] UpdateCursorImage();
}
```
The implementation of this service is shown by the following code:
```csharp
public class RemoteDesktopService : IRemoteDesktop
{
    // An instance of the screen capture class.
    private ScreenCapture capture = new ScreenCapture();

    /// <summary>
    /// Capture the screen image and return bytes.
    /// </summary>
    /// <returns>4 ints [top,bot,left,right] (16 bytes) + image data bytes</returns>
    public byte[] UpdateScreenImage()
    {
        // Capture minimally sized image that encompasses
        // all the changed pixels.
        Rectangle bounds = new Rectangle();
        Bitmap img = capture.Screen(ref bounds);
        if (img != null)
        {
            // Something changed.
            byte[] result = Utils.PackScreenCaptureData(img, bounds);

            // Log to the console.
            Console.WriteLine("{0} Screen Capture - {1} bytes", DateTime.Now, result.Length);

            return result;
        }
        else
        {
            // Nothing changed. Log to the console.
            Console.WriteLine("{0} Screen Capture - {1} bytes", DateTime.Now, 0);

            return null;
        }
    }

    /// <summary>
    /// Capture the cursor data.
    /// </summary>
    /// <returns>2 ints [x,y] (8 bytes) + image bytes</returns>
    public byte[] UpdateCursorImage()
    {
        // Get the cursor bitmap.
        int cursorX = 0;
        int cursorY = 0;
        Bitmap img = capture.Cursor(ref cursorX, ref cursorY);
        if (img != null)
        {
            // A cursor image is available.
            byte[] result = Utils.PackCursorCaptureData(img, cursorX, cursorY);

            // Log to the console.
            Console.WriteLine("{0} Cursor Capture - {1} bytes", DateTime.Now, result.Length);

            return result;
        }
        else
        {
            // No cursor image. Log to the console.
            Console.WriteLine("{0} Cursor Capture - {1} bytes", DateTime.Now, 0);

            return null;
        }
    }
}
```
Notice that the service methods return a byte array. I selected this format because I wanted to pack data into a structure before sending and then unpack it on the other side. The packing is done by static methods of the Utils class. This class needs to be cleaned up a bit; for now it serves as a holder for methods that I know will be reused. The following code shows one of the packing methods.
```csharp
public static byte[] PackScreenCaptureData(Image image, Rectangle bounds)
{
    // Pack the image data into a byte stream to
    // be transferred over the wire.

    // Get the bytes of the image data.
    // Note: We are using JPEG compression.
    byte[] imgData;
    using (MemoryStream ms = new MemoryStream())
    {
        image.Save(ms, System.Drawing.Imaging.ImageFormat.Jpeg);
        imgData = ms.ToArray();
    }

    // Get the bytes that describe the bounding rectangle.
    byte[] topData = BitConverter.GetBytes(bounds.Top);
    byte[] botData = BitConverter.GetBytes(bounds.Bottom);
    byte[] leftData = BitConverter.GetBytes(bounds.Left);
    byte[] rightData = BitConverter.GetBytes(bounds.Right);

    // Create the final byte stream.
    // Note: We are streaming back both the bounding
    // rectangle and the image data.
    int sizeOfInt = topData.Length;
    byte[] result = new byte[imgData.Length + 4 * sizeOfInt];
    Array.Copy(topData, 0, result, 0, topData.Length);
    Array.Copy(botData, 0, result, sizeOfInt, botData.Length);
    Array.Copy(leftData, 0, result, 2 * sizeOfInt, leftData.Length);
    Array.Copy(rightData, 0, result, 3 * sizeOfInt, rightData.Length);
    Array.Copy(imgData, 0, result, 4 * sizeOfInt, imgData.Length);

    return result;
}
```
Pretty straightforward. Notice that I am using JPEG compression on what is left of the image before packing it into the byte array. There are similar routines that unpack the data on the other side. There may be a better way to do this marshalling. If you have ideas, please let me know.
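The unpacking routines are not listed in this post. As a rough idea of what the receiving side has to do, here is a minimal sketch that reverses the layout produced by PackScreenCaptureData: four ints (top, bottom, left, right) followed by the JPEG bytes. The names ScreenPacketSketch and Unpack are hypothetical, and the sketch returns the raw JPEG bytes rather than a Bitmap so that it depends only on the base class library. Like the packing code, it assumes BitConverter uses the same byte order on both machines.

```csharp
using System;

public static class ScreenPacketSketch
{
    private const int HeaderSize = 4 * sizeof(int);

    // Splits a packed screen update into its bounding edges and
    // the JPEG-encoded image bytes that follow the 16-byte header.
    public static void Unpack(byte[] data, out int top, out int bottom,
                              out int left, out int right, out byte[] jpeg)
    {
        if (data == null || data.Length < HeaderSize)
        {
            throw new ArgumentException("Packet is shorter than the 16-byte header.");
        }

        // Same order as the packing code: top, bottom, left, right.
        top = BitConverter.ToInt32(data, 0);
        bottom = BitConverter.ToInt32(data, sizeof(int));
        left = BitConverter.ToInt32(data, 2 * sizeof(int));
        right = BitConverter.ToInt32(data, 3 * sizeof(int));

        // Everything after the header is the compressed image.
        jpeg = new byte[data.Length - HeaderSize];
        Array.Copy(data, HeaderSize, jpeg, 0, jpeg.Length);
    }
}
```

On the client, the JPEG bytes could then be decoded with Image.FromStream over a MemoryStream, and the four edges tell the client where in the full screen image the decoded patch belongs.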
For now I am hosting the WCF service in a console application. The code for the host is simple enough:
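```csharp
[STAThread]
static void Main(string[] args)
{
    // The order of the address list is machine-specific, so pick the
    // first IPv4 address instead of relying on a fixed index.
    string myHost = System.Net.Dns.GetHostName();
    System.Net.IPAddress address = Array.Find(
        System.Net.Dns.GetHostEntry(myHost).AddressList,
        a => a.AddressFamily == System.Net.Sockets.AddressFamily.InterNetwork);
    if (address == null)
    {
        Console.WriteLine("No IPv4 address found for " + myHost);
        return;
    }
    string myIp = address.ToString();
    Uri baseAddress = new Uri("http://" + myIp + ":8080/Rlc/RemoteDesktop");

    Console.WriteLine("WCF Remote Desktop Server");
    Console.WriteLine("=========================");
    Console.WriteLine();
    Console.WriteLine("Initializing server endpoint...");
    Console.WriteLine("Listening on: " + baseAddress.ToString());
    Console.WriteLine();

    ServiceHost myServiceHost = new ServiceHost(typeof(RemoteDesktopService), baseAddress);
    myServiceHost.Open();

    // Keep the service running until Enter is pressed.
    Console.ReadLine();

    if (myServiceHost.State != CommunicationState.Closed)
    {
        myServiceHost.Close();
    }
}
```
When you start the host, the console shows the banner and the address the service is listening on.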
WinForm Client
For the client, I created a simple WinForms application. In the Visual Studio designer it looks like the following:
The dashed rectangle is a picture box. The two text boxes at the bottom hold the refresh rates, in milliseconds, for the screen and the cursor. The apply button allows the refresh rates to be updated after the application has been started. The two elements that you cannot see are the timers that trigger the updates for the screen and the cursor.
The first step in connecting to the WCF service is to create a service reference. You just need to fire up the service and let the WSDL do the work. You will then have a proxy for accessing the service methods we defined earlier.
The following code is the implementation for the triggers for the two timers:
```csharp
private void timer1_Tick(object sender, EventArgs e)
{
    byte[] data = svc.UpdateScreenImage();
    if (data != null)
    {
        // Update the current screen.
        Utils.UpdateScreen(ref _screen, data);

        // Update the UI.
        ShowImage();
    }
    // Otherwise the screen has not changed.
}

private void cursorTimer_Tick(object sender, EventArgs e)
{
    byte[] data = svc.UpdateCursorImage();
    if (data != null)
    {
        // Unpack the data.
        Utils.UnpackCursorCaptureData(data, out _cursor, out _cursorX, out _cursorY);
    }
    else
    {
        _cursor = null;
    }

    // Update the UI.
    ShowImage();
}
```
Notice that each method uses the WCF service proxy (svc) to call the service method remotely. The returned data is then unpacked using another method in the “Utils” class. Finally, the UI is updated to show the new data. During this update, the screen image and the cursor image are merged together.
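ShowImage is not listed in this post. The merge step it performs could look something like the following sketch, which draws the cursor bitmap on top of a copy of the screen bitmap at the reported cursor position. ComposeSketch and Compose are hypothetical names, and the sketch assumes the cursor coordinates are the top-left corner of the cursor image in screen pixels; the real client may handle hot spots or scaling differently.

```csharp
using System.Drawing;

public static class ComposeSketch
{
    // Returns a new bitmap showing the screen with the cursor drawn
    // over it. The inputs are left untouched; cursor may be null.
    public static Bitmap Compose(Bitmap screen, Bitmap cursor, int cursorX, int cursorY)
    {
        Bitmap frame = new Bitmap(screen.Width, screen.Height);
        using (Graphics g = Graphics.FromImage(frame))
        {
            // Passing the size explicitly keeps the copy pixel-for-pixel
            // regardless of the resolution stored in the source bitmap.
            g.DrawImage(screen, 0, 0, screen.Width, screen.Height);
            if (cursor != null)
            {
                g.DrawImage(cursor, cursorX, cursorY, cursor.Width, cursor.Height);
            }
        }
        return frame;
    }
}
```

The returned bitmap can be assigned to the picture box. The caller owns it and should dispose the previous frame once it is no longer displayed.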
The following shows a screenshot of the client running. Notice that it is capturing itself over and over.
Results
The results so far are decent. When there is no activity on the remote machine, the server does not transfer any image data to the client. This is shown below in the server console window.
During activity the number of bytes transferred adapts to the number of pixels changed.
As can be seen in the screens above, the refresh rates range from 2 to 10 frames per second depending on the number of pixels changed. Take these numbers with some skepticism, since these tests were run on my internal network (100MB/s).
I have some more road to travel with this concept before I post the project. I will also try to post a video of the desktop viewer at some point. If you have any suggestions on improvements, please leave a comment.